The End of Tech Companies

September 20, 2016 by Rob Thomas

“If you aren’t genuinely pained by the risk involved in your strategic choices, it’s not much of a strategy.” — Reed Hastings

Enterprise software companies are facing unprecedented market pressure. With the emergence of cloud, digital, machine learning, and analytics (to name a few), the traditional business models, cash flows, and unit economics are under pressure. The results can be seen in some public stock prices (HDP, TDC, IMPV, etc.), and nearly everyone’s financials (flat to declining revenues in traditional spaces).

The results can also be seen in the number of private transactions occurring (Informatica, Qlik, etc.); it’s easier to change your business model outside of the public eye. In short, business models reliant on traditional distribution models, large dollar transactions, and human-intensive operations will remain under pressure.

Many ‘non-tech companies’ tell me, “thank goodness that is not the business we are in” or “technology changes too fast, I’m glad we are in a more traditional space”. These are false hopes. This fundamental shift is coming (or has already come) to every business and every industry, in every part of the world. It does not matter if you are a retailer, a manufacturer, a healthcare provider, an agricultural producer, or a pharma company. Your traditional distribution model, operational mechanics, and method of value creation will change in the next 5 years; you will either lead or be left behind.

It’s been said that we sit on the cusp of the next Industrial Revolution. Data, IoT, and software are replacing industrialization as the driving force of productivity and change. Look no further than the public markets; the 5 largest companies in the world by value are:

As Benedict Evans observed, “It is easier for software to enter other industries than for other industries to hire software people.” In the same vein, Naval Ravikant commented, “Competing without software is like competing without electricity.” The rise of the Data era, coupled with software and connected device sprawl, creates an opportunity for some companies to outperform others. Those who figure out how to apply this advantage will drive unprecedented wealth creation and comprise the new S&P 500.

This is the end of ‘tech companies’. The era of “tech companies” is over; there are only ‘companies’, steeped in technology, that will survive.

Read the rest on Medium here.

The 4th Dimension of Enterprise Software

September 02, 2016 by Rob Thomas

Charles and Miranda first met in art school in 1979. Over time, they realized a shared passion for handwork and the elegance of handmade objects for the home. Today, Charles Shackleton Furniture and Miranda Thomas Pottery, the workshops that comprise ShackletonThomas, consist of a group of individuals who share their philosophy.

Charles and Miranda think about 4 elements when creating an object:

1) Design- the shape, decoration, functionality, and style.

2) Materials- they select the best and most beautiful materials for design.

3) Craftsmanship- the precision, finesse, and functionality for how an object is put together.

4) The Fourth Dimension- “this is the element of design caused when the object is made by human hand or a tool directly controlled by human hand. All are imperfect, like the human that created it. But, the imperfections are beautiful.”

The fourth dimension is the crucial and final aspect that makes a piece of art truly great. “This is what gives life and soul to the inanimate object.”

***

Every incumbent player in the enterprise software market is facing a 4th dimension challenge. The first 3 dimensions are nearly the same for everyone; it’s how they invest their R&D/SG&A across serving users, their existing clients/products, and a platform for the future.

Read the rest here.

A Practical Guide to Machine Learning: Understand, Differentiate, and Apply

July 22, 2016 by Rob Thomas

Co-authored by Jean-Francois Puget (@JFPuget)

Machine Learning represents the new frontier in analytics, and is the answer to how many companies can capitalize on the data opportunity. Machine Learning was first defined by Arthur Samuel in 1959 as a “Field of study that gives computers the ability to learn without being explicitly programmed.” Said another way, this is the automation of analytics, so that it can be applied at scale. What is highly manual today (think about an analyst combing thousand-line spreadsheets) becomes automatic tomorrow (an easy button) through technology. If Machine Learning was first defined in 1959, why is this now the time to seize the opportunity? It’s the economics.

A relative graphic to explain:

Since the time that Machine Learning was defined and through the last decade, the application of Machine Learning was limited by the cost of compute and data acquisition/preparation. In fact, compute and data consumed the entirety of any budget for analytics which left zero investment for the real value driver: algorithms to drive actionable insights. In the last couple years, with cost of compute and data plummeting, machine learning is now available to anyone, for rapid application and exploitation.

***

It is well known that businesses must constantly adapt to changing conditions: competitors introduce new offerings, consumer habits evolve, and the economic and political environment changes. This is not new, but the velocity at which business conditions change is accelerating. This constantly accelerating pace of change places a new burden on technology solutions developed for a business.

Over the years, application developers moved from V-shaped projects, with multi-year turnaround, to agile development methodologies (turnaround in months, weeks, and often days). This has enabled businesses to adapt their applications and services much more rapidly. For example:

a) A sales forecasting system for a retailer: The forecast must take into account today's market trends, not just those from last month. And, for real-time personalization, it must account for what happened as recently as 1 hour ago.

b) A product recommendation system for a stock broker: It must leverage current interests, trends, and movements, not just last month's.

c) A personalized healthcare system: Offerings must be tailored to an individual and their unique circumstance. Healthcare devices, connected via The Internet of Things (IoT), can be used to collect data on human and machine behavior and interaction.

These scenarios, and others like them, create a unique opportunity for machine learning. Indeed, machine learning was designed to address the fluid nature of these problems.

Firstly, it moves application development from programming to training: instead of writing new code, the application developer trains the same application with new data. This is a fundamental shift in application development, because new, updated applications can be obtained automatically on a weekly, if not daily basis. This shift is at the core of the cognitive era in IT.

Secondly, machine learning enables the automated production of actionable insights where the data is (i.e. where business value is greatest). It is possible to build machine learning systems that learn from each user interaction, or from new data collected by an IoT device. These systems then produce output that takes into account the latest available data. This would not be possible with traditional IT development, even if agile methodologies were used.
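As a rough illustration of a system that learns from each user interaction, here is a minimal Python sketch based on the stock broker scenario above. It is only a sketch under stated assumptions: the `OnlineRecommender` class, the product names, and the learning rate are all hypothetical, and the exponentially weighted click rate stands in for whatever model a real system would use.

```python
from collections import defaultdict


class OnlineRecommender:
    """Hypothetical recommender that updates a per-item score on every interaction."""

    def __init__(self, learning_rate=0.2):
        # Assumed value: higher rates forget old behavior faster.
        self.learning_rate = learning_rate
        self.scores = defaultdict(float)

    def update(self, item, clicked):
        # Move the item's score a small step toward the latest observation,
        # so recent interactions outweigh older ones.
        target = 1.0 if clicked else 0.0
        self.scores[item] += self.learning_rate * (target - self.scores[item])

    def recommend(self):
        # Recommend the item with the highest current score.
        return max(self.scores, key=self.scores.get)


rec = OnlineRecommender()

# Early interactions: clients respond to bonds, not tech stocks.
for _ in range(10):
    rec.update("bonds", clicked=True)
    rec.update("tech_stocks", clicked=False)
first_pick = rec.recommend()

# Interests shift: the same application adapts with no new code.
for _ in range(10):
    rec.update("bonds", clicked=False)
    rec.update("tech_stocks", clicked=True)
second_pick = rec.recommend()

print(first_pick, second_pick)
```

The application code never changes between the two phases; only the data does, which is the shift from programming to training described above.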

***

While most companies get to the point of understanding machine learning, too few are turning this into action. They are either slowed down by concerns over their data assets or they attempt it one-time and then curtail efforts, claiming that the results were not interesting. These are common concerns and considerations, but they should be recognized as items that are easily surmounted, with the right approach.

First, let’s take data. A common trap is to believe that data is all that is needed for a successful machine learning project. Data is essential, but machine learning requires more than data. Machine learning projects that start with a large amount of data, but lack a clear business goal or outcome, are likely to fail. Projects that start with little or no data, yet have a clear and measurable business goal are more likely to succeed. The business goal should dictate the collection of relevant data and also guide the development of machine learning models. This approach provides a mechanism for assessing the effectiveness of machine learning models.

The second trap in machine learning projects is to view it as a one-time event. Machine learning, by definition, is a continuous process and projects must be operated with that consideration.

Machine learning projects are often run as follows:

1) They start with data and a new business goal.

2) Data is prepared, because it wasn’t collected with the new business goal in mind.

3) Once prepared, machine learning algorithms are run on the data in order to produce a model.

4) The model is then evaluated on new, unforeseen, data to see whether it captured something sensible from the data. If it does, then it is deployed in a production environment where it is used to make predictions on new data.
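The four steps above can be sketched in a few lines of Python. This is a toy illustration, not a recommended implementation: the synthetic transaction data, the helper names (`prepare`, `train`, `accuracy`), the nearest-mean threshold model, and the 0.95 deployment bar are all assumptions made for the example.

```python
import random


def make_raw_rows(n, seed):
    """Step 1: synthetic raw data (assumed): an amount and a 0/1 label per row."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        label = rng.random() < 0.3
        amount = rng.gauss(900 if label else 100, 50)
        rows.append({"amount": amount, "label": int(label)})
    rows.append({"amount": None, "label": 0})  # a row with missing data
    return rows


def prepare(raw_rows):
    """Step 2: drop unusable rows and convert the rest to (amount, label) pairs."""
    return [(float(r["amount"]), int(r["label"])) for r in raw_rows
            if r.get("amount") is not None and r.get("label") is not None]


def train(rows):
    """Step 3: learn a threshold halfway between the two class means."""
    positives = [a for a, label in rows if label == 1]
    negatives = [a for a, label in rows if label == 0]
    return (sum(positives) / len(positives) + sum(negatives) / len(negatives)) / 2


def accuracy(threshold, rows):
    """Share of rows where the threshold rule agrees with the label."""
    hits = sum((a >= threshold) == (label == 1) for a, label in rows)
    return hits / len(rows)


rows = prepare(make_raw_rows(500, seed=7))
split = int(len(rows) * 0.8)
train_rows, holdout_rows = rows[:split], rows[split:]

model = train(train_rows)

# Step 4: evaluate on data the model has not seen, then decide on deployment.
holdout_accuracy = accuracy(model, holdout_rows)
deploy = holdout_accuracy >= 0.95  # assumed bar, set by the business goal
print(f"hold-out accuracy: {holdout_accuracy:.3f}, deploy: {deploy}")
```

Note that the model is trained once and then frozen, which is exactly the limitation of this workflow.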

While this typical approach is valuable, it is limited by the fact that the models learn only once. While you may have developed a great model, changing business conditions may make it irrelevant. For instance, assume machine learning is used to detect anomalies in credit card transactions. The model is created using years of past transactions, and the anomalies are fraudulent transactions. With a good data science team and the right algorithms, it is possible to obtain a fairly accurate model. This model can then be deployed in a payment system where it flags anomalies when it detects them. Transactions with anomalies are then rejected. This is effective in the short term, but clever criminals will soon recognize that their scam is detected. They will adapt, and they will find new ways to use stolen credit card information. The model will not detect these new ways because they were not present in the data that was used to produce it. As a result, the model effectiveness will drop.

The cure for this performance degradation is to monitor the effectiveness of model predictions by comparing them with actuals. For instance, after some delay, a bank will know which transactions were fraudulent or not. Then it is possible to compare the actual fraudulent transactions with the anomalies detected by the machine learning model. From this comparison one can compute the accuracy of the predictions. One can then monitor this accuracy over time and watch for drops. When a drop happens, it is time to refresh the machine learning model with more up-to-date data. This is what we call a feedback loop. See here:

With a feedback loop, the system learns continuously by monitoring the effectiveness of predictions and retraining when needed. Monitoring and using the resulting feedback are at the core of machine learning. This is no different than how humans perform a new task. We learn from our mistakes, adjust, and act. Machine learning is no different.
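A feedback loop of this kind can be sketched as follows. It is a simplified simulation under loud assumptions: the weekly batches are synthetic, the fraud pattern is reduced to a single transaction amount, the model is a toy threshold rule, and `MIN_ACCURACY` is a hypothetical alert level. The point is only the shape of the loop: score predictions against actuals, watch for a drop, and retrain on recent data.

```python
import random


def fit_threshold(rows):
    """Toy model: threshold halfway between mean fraud and mean normal amount."""
    fraud = [a for a, is_fraud in rows if is_fraud]
    normal = [a for a, is_fraud in rows if not is_fraud]
    return (sum(fraud) / len(fraud) + sum(normal) / len(normal)) / 2


def accuracy(threshold, rows):
    """Compare predictions (amount >= threshold) with the known actuals."""
    return sum((a >= threshold) == is_fraud for a, is_fraud in rows) / len(rows)


def weekly_batch(rng, fraud_mean, n=400):
    """Synthetic week of (amount, is_fraud) pairs; fraud_mean models criminal behavior."""
    rows = []
    for _ in range(n):
        is_fraud = rng.random() < 0.3
        rows.append((rng.gauss(fraud_mean if is_fraud else 100, 30), is_fraud))
    return rows


rng = random.Random(11)
MIN_ACCURACY = 0.9  # assumed alert level

# Initial model, trained on historical data where fraud means large amounts.
model = fit_threshold(weekly_batch(rng, fraud_mean=900))

history = []
retrained_in_weeks = []

# In week 3 the criminals adapt and switch to much smaller amounts.
for week, fraud_mean in enumerate([900, 900, 300, 300], start=1):
    batch = weekly_batch(rng, fraud_mean)  # actuals arrive after some delay
    score = accuracy(model, batch)
    history.append(score)
    if score < MIN_ACCURACY:
        # Accuracy dropped: refresh the model with up-to-date data.
        model = fit_threshold(batch)
        retrained_in_weeks.append(week)

print(history, retrained_in_weeks)
```

In this simulation the stale model misses the new fraud pattern in week 3, the drop in accuracy triggers a retrain, and accuracy recovers in week 4.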

***

Companies that are convinced that machine learning should be a core component of their analytics journey need a tested and repeatable model: a methodology. Our experience working with countless clients has led us to devise a methodology that we call DataFirst. It is a step-by-step approach for machine learning success.

Phase 1: The Data Assessment
The objective is to understand your data assets and verify that all the data needed to meet the business goal for machine learning is available. If not, you can take action at that point, to bring in new sources of data (internal or external), to align with the stated goal.

Phase 2: The Workshop
The purpose of the workshop is to ensure alignment on the definition and scope of the machine learning project. We usually cover these topics:
- Level set on what machine learning can and cannot do.
- Agree on which data to use.
- Agree on the metric to be used for results evaluation.
- Explore how the machine learning workflow, especially deployment and feedback loop, would integrate with other IT systems and applications.

Phase 3: The Prototype
The prototype aims to show machine learning value with actual data. It will also be used to assess the performance and resources needed to run and operate a production-ready machine learning system. When completed, the prototype is often key to securing a decision to develop a production-ready system.

***

Leaders in the Data era will leverage their assets to develop superior machine learning and insight, driven from a dynamic corpus of data. A differentiated approach requires a methodical process and a focus on differentiation with a feedback loop. In the modern business environment, data is no longer an aspect of competitive advantage; it is the basis of competitive advantage.

iPad Pro: Going All-in

June 20, 2016 by Rob Thomas

Here is my tweet from a few weeks back:

I have given it a go, going all-in with the iPad Pro. In short, I believe I have discovered the future of personal computing. That being said, in order to do this, you truly have to change the way you work; how you spend your time, how you communicate, etc. But, it's worth it and will probably make you a better professional. I knew I was hooked, when I had to go back to my MacBook for something and I started touching the screen; the touch interface had been ingrained in my work.

Here are my quick observations:

1) The speed of the iPad Pro is unbelievable. While I didn't realize this in advance, this fact alone makes up for a lot of the reasons why I could never move to an iPad before.

2) You have to master multitasking in the iPad Pro in order to make the switch. There are a lot of shortcuts on the screen, keyboard shortcuts, and hand gestures. If you are not using them, you will not understand the advantage of this form factor.

3) Keyboard shortcuts are now available for my corporate mail. That's a big time saver.

4) I never have to worry about a power cable. The battery on this is great, but even if it gets low, nearly everyone I know has a compatible charger.

5) The integration of apps on the Pro is tremendous: Box/Office, Slack, etc.

6) It goes without saying that the Pro is super light and convenient for travel.

7) Here are some things I can't do on the iPad Pro:
- Renew Global Entry
- Corporate workflow (forms and expenses)
- Blogging (writing is easy, but posting to corporate blog or even Blogger is very hard). I'm not sure why there is not a good app for this.

8) I got the smaller version of the iPad Pro. I thought the large one was just too big. It seems like the ideal size may be a size in between the two.

In short, after a few weeks, I highly recommend it. You can make the switch, but you'll likely need a laptop once a week or so, for some of the items mentioned above. I haven't really gotten into the Apple Pencil yet. I've used it a couple times and may try it more over the next couple weeks.

Data Science is a Team Sport

June 07, 2016 by Rob Thomas

In 2013, Ron Howard directed and released the movie Rush, a film that captured the rivalry between James Hunt and Niki Lauda during the 1976 Formula One racing season. It’s a vivid portrait of the drivers and their personalities—a pretty typical, if captivating focus on the drivers as heroes of the race. But it does something deeper and more interesting as well. The film looks into the essence of Formula One—a true team sport.

“Formula” in Formula One refers to the set of rules to which all participants' cars must conform. Formula One rules were agreed upon in 1946, on the heels of World War II. Modern Formula One cars are open cockpit, single-seat vehicles. The cornering speed of a car comes from “wings” mounted at the front and rear of the vehicle. The tires also play a major role in the cornering speed of a car. Carbon disc brakes are used to increase performance. Engines have evolved to turbocharged V6’s. All these components are integrated to provide precision and performance, and to win the race. However, the precision and design of the vehicle is useless, without the right team.

In Formula One, an “entrant” is the person who registers a car and driver for the race, and maintains the vehicle. The “constructor” is the person who builds the engine or chassis and owns the intellectual rights to the design. The “pit crew” is the team that prepares and maintains the vehicle before, during, and after the race. The cameras focus on the driver, with a couple of obligatory shots of the pit crew scrambling to change tires. But the real story is the collaboration of the complete team: experts working together to make the difference between success and failure.

***

Since the turn of the century, enterprises around the world have been on a journey to master data science and analytics. We have fewer camera crews, and no cool uniforms, but the goal is no less difficult to achieve. Said simply, we want the right information, at the right moment, to make better decisions. Despite years of effort, organizations have achieved inconsistent results. Some are building competitive moats with machine learning on a large corpus of data, but others are only reducing their costs by 3%, using some new tools. This is best viewed on an enterprise maturity curve:

Why are some organizations able to achieve differentiated results, while others struggle to set up a Hadoop cluster?

***

Spark is the Analytics Operating System for the modern enterprise. Anyone using data, starting right now, will be leveraging Spark. Spark enables universal access to data in an organization.

Today, we are announcing the Data Science Experience, the first enterprise app available for the Analytics Operating System. This is the first integrated development environment for real-time, high performance Analytics, designed to blend emerging data technologies and machine learning into existing architectures.

An IDE for data science is a collaborative environment; it brings data scientists together to make data science and machine learning available to everyone. Today, data science is an individual sport. If you are a data scientist at a retailer, for example, you have to choose your own tool or flavor, work on your own, and, with any luck, you produce a meaningful insight. Anything you learn stays with you—it’s self-contained, because it is built in your own lingua-franca.

Now, with the Data Science Experience, you can use any language you want—R, Python, Scala, etc.—and share your models with other data scientists in your organization.

We have made data science a team sport.

In Formula One parlance, Spark is the chassis, holding everything together. The Data Science Experience (the IDE) is the integrated components, acting as one, to drive precision and performance. And the data science discipline now has a driver, a pit crew, a constructor, and a coach, that incredible vehicle whose sum is greater than its parts: a team.

The Data Science Experience is born on the cloud. It adapts to open source innovation. And the Data Science Experience grows stronger as more and more data scientists around the globe create solutions based on Spark. Further, the ecosystem for The Data Science Experience is open and available. We are proud to have partners like H2O, RStudio, Lightbend, and Galvanize, to name a few.

With Data Science Experience, the discipline of data science can now accomplish exponentially greater outcomes. It’s the difference between a shiny car sitting in a garage, and crossing the finish line at 230 miles per hour.

***

IBM is building the next generation analytics platform in the cloud.

1. It started with our investment in Apache Spark as the Analytics O/S, last year.
2. It continues today, as we launch the first IDE for this new way of thinking about data & analytics.
3. Over time, this will evolve as the platform for an enterprise in the data era.

All of this is enabled by Spark.

***

In June 2015, we announced IBM’s commitment to Apache Spark. In closing, I want to provide some context on our progress in the last year. If you missed it last year, here is why I believe Spark will be a critical force in technology, on the same scale as Linux.

So, what have we accomplished and where are we going?

1) We continue to expand the Spark Technology Center (STC). We opened an STC in India. We continue to hire aggressively. And, later this year, we will move into our new home on Howard St. in San Francisco.

2) Client traction and response has been phenomenal. We have 40+ client references already and more on the way.

3) We have open sourced SystemML as promised and we are working on it with the community, in the open. This contribution is over 100,000 lines of code. SystemML was accepted into Apache as an official Incubator project as of November 2015. Since it was open-sourced, 859 contributions have been made to the project (i.e. a build-out of the Spark backend, API improvements; usability with Scala Spark & PySpark notebooks for data science, experimental work into deep learning, etc.)

4) For Spark 1.6.x, a total of 29 team members contributed to the release (26 of them from the STC), and each contributing engineer is a credited contributor in the release notes of Spark 1.6.x. For Spark 2.0, 31 STC developers have contributed thus far; this work is still in progress.

5) Our Spark-specific JIRAs have amounted to almost 25,000 lines of code. You can watch them in action here. Much of our focus has been on SQL, MLlib, and PySpark.

6) We launched the Open Source Analytics Ecosystem and are working closely with partners like Databricks, Lightbend, RStudio, H2O, and many others. We welcome all.

7) We have trained ~400,000 data scientists through a number of forums, including BigDataUniversity.com.

8) Adoption of the Spark Service on IBM Cloud continues to grow exponentially, as users seek access to the Analytics Operating System.

9) We have over 30 IBM products that are leveraging Spark and many more in the pipeline.

10) We launched a Spark newsletter. Anyone can subscribe here.

11) Lastly, we have launched a Spark Advisory Council. Over 25 leading enterprises and partners — Spark experts building new companies and established industry leaders building new platforms — participate in this regular dialogue about their experiences with Spark and the direction of the Spark project. We use this thinking to focus our efforts in the Spark Technology Center. All are welcome. Contact us here if you are interested.

***

Data Science is a team sport. Spark is the enabler. This is why I stated last year that anyone using data will be leveraging Spark in the future. That future is quickly arriving.

Winning in Formula One is about speed, performance, precision, and collaboration. Those that find the winner’s circle have found a way to integrate the components (human and material) to act as ONE. The same opportunity exists in Analytics and Data Science. Let’s make data science a team sport. Welcome to the first enterprise app for the Analytics Operating System: The Data Science Experience.