
Friday, February 5, 2016

Gather the Fruit and Burn the Tree




There is an old story about a gentleman who, while walking through the countryside, comes upon a plum orchard. As he is walking through the orchard, he notices a plum tree whose fruit is ripe, but the tree has crashed into ruins on the ground. He starts to survey the tree to determine if it has collapsed due to the weight of the fruit or a recent storm. The farmer of the orchard walks up and promptly shows the man that insects have eaten through a good portion of the tree, causing its collapse. The man turns to the farmer and says, “Well, what do you do now?” The farmer replies, “It’s time to gather the fruit and burn the tree.”

***

Finding and cultivating the right mentors will change your life and career. While I have a number of people asking me to act as their mentor, I feel like I’m a pretty average mentor, when it comes to the particular task. One of the main reasons I agree to act as a mentor is because I learn a lot in the process. In many cases, it’s not me telling the mentee what they need to hear, it’s me saying what we both need to hear. One of my goals over the next 12 months is to become a better mentor, in order to really help people raise their own personal bar of achievement. I want to provide whatever ‘fruit’ is needed, and hopefully propel them onto whatever is next.

I believe that a mentor is fundamentally responsible for doing 4 things:

1) Inspire
2) Teach
3) Encourage
4) Positively affect

Listening is not good enough. Neither is simply giving advice. There has to be more, and I think doing 2 or more of the above, in any interaction, is a worthy goal.

***

In order to choose or act as a mentor, I think it takes more than understanding the responsibilities of the role; you also have to understand what makes a good mentor. I listened to a podcast recently that was talking about board members of public/private companies. The assertion was that a person is qualified as a board member (“quad-qualified”) if they have the following attributes:

1) Independence
2) Bandwidth
3) Motivation
4) Expertise

For anyone looking for a mentor, you should ensure that your mentor is “quad-qualified”:

1) Independence: If you work for them, they are not independent. If you work with them, they may not be independent. You need someone truly independent.
2) Bandwidth: In addition to their job, how many other people are they mentoring? If they don’t have the bandwidth to spend time with you, neither of you will get maximum value out of the relationship.
3) Motivation: Are they personally motivated to help you out? Do you have a past together, such that they would personally invest in you? In my experience, “blind date mentorships” don’t work due to a lack of motivation.
4) Expertise: Do they work and live in a similar environment, such that they can provide relevant expertise? Or, do you intentionally need someone that has a different background/expertise?

If you are seeking a mentor, take the process seriously and ensure that they are quad-qualified. If not, you are probably both wasting your time.

***

I think the only way that I can really help as a mentor is to help others think about their specific situations/careers, through a different set of eyes. It may be helping them with skill development (encourage/train), it may be helping them see the bigger picture (inspire), or perhaps just sharing how I have approached similar situations (teach and positively affect). I typically try to help mentees in a few ways:

- Getting comfortable, being uncomfortable.
- Challenge them on their preparation: what they read, study, etc.
- Understand what they look forward to every day and how it's relevant to success
- Prioritize: what’s the one thing you can do this week, such that by doing it, everything else would be easier?
- Apply Pareto to their to-do list (3 P’s, etc)
- Understand and apply the Rockefeller habits (Priorities, Rhythm, Data)
- Be on offense
- Career decisions
- Feedback loops
- Pre-mortems
- 7 measures of Good Enough
- Hill climbing
- Goal setting

One of the primary themes that I have noticed across most of my mentor/mentee relationships is that most people tend to overvalue the near term and undervalue long-term rewards. This is a shame, because I see people making sub-optimal decisions and setting off on the wrong hill. I try to remind people that you can always trade up…but sometimes it's best to defer the easy step and focus on your real goals.

***

Gather the Fruit and Burn the Tree. For a mentee, a mentor/mentee relationship is about absorbing whatever you can and then focusing on what’s next. While these can certainly be long term relationships, they don’t have to be; I don’t think any great mentor has that expectation. Everyone should cultivate mentors and mentees throughout their career; gather what they can, and focus on the next step.

Monday, December 21, 2015

Second-Level Thinking






Howard Marks is a well-respected investor and the founder of Oaktree Capital Management. In a recent letter to investors, he introduced a concept that he calls 'Second-Level Thinking'. In his words:

This is a crucial subject that has to be understood by everyone who aspires to be a superior investor. Remember your goal in investing isn’t to earn average returns; you want to do better than average. Thus your thinking has to be better than that of others – both more powerful and at a higher level. Since others may be smart, well-informed and highly computerized, you must find an edge they don’t have. You must think of something they haven’t thought of, see things they miss, or bring insight they don’t possess. You have to react differently and behave differently. In short, being right may be a necessary condition for investment success, but it won’t be sufficient. You must be more right than others . . . which by definition means your thinking has to be different. . .

For your performance to diverge from the norm, your expectations have to diverge from the norm, and you have to be more right than the consensus. Different and better: that’s a pretty good description of second-level thinking.

Second-level thinking is deep, complex and convoluted.

Certainly, he sets a high mark for how to stretch our thinking.

In the context of the technology industry, I would use the following examples to contrast first-level and second-level thinking around building products:

First-level thinking says, “Clients are asking for this; this functionality will fill a need.” Second-level thinking says, “It’s something that our clients are asking for, but everyone is asking for that. Therefore, every competitor is pursuing that, and it's just a race to the finish that will quickly commoditize; let’s go in a different direction.”

First-level thinking says, “The IT analyst firms say this market will have low growth and most companies already have the capability. Let’s focus on a different market.” Second-level thinking says, “The outlook stinks, but everyone else is abandoning it. We could reinvent how clients are consuming in this critical area. Double down!”

These are rudimentary and simple, but hopefully sufficient examples for how Second-Level Thinking may apply in the technology industry.


***

Market Forces at Work


We are in an unprecedented business cycle. Protracted low interest rates have discouraged saving, and therefore money is put to work. At the same time, the rise of activist investors has altered traditional approaches to capital allocation. Public companies are being pushed to monetize their shareholders' investments, either in the form of dividends or buybacks (and most often both). Because of this unrelenting pressure on public companies, investment has begun to flow more drastically towards private enterprises (at later and later stages), leading to the 'unicorn' phenomenon. These 'unicorn' companies, which have the time and resources in their current form, are doing 3 things:

1) Paying anything for talent, causing wage inflation for engineers and some other roles.
2) Attempting to re-invent many industries, by applying technology and in many cases, shifting them to a pay-as-you-go (or as-a-service) model.
3) Spending aggressively, in any form necessary, to drive growth.

Public companies, in some cases, are crowded out of the investments they would normally make, given this landscape. But, a central truth remains: at some point, an enterprise must make money. That timeline is typically compressed when capital begins to dry up. The term 'unicorn' was first used to connote something that is rarely seen. The fact that they are now on every street corner is perhaps an indication that time is short.

***

The Impact

1) "Winter is coming" for the engineering wage cycle. Currently, this inflation is driven in part by supply/demand but more so by the cult of "free money" and nothing else better to do with it. At some point, when 'hire at any cost' dissipates, we will know who has truly built differentiated skills.

2) The rise of cloud and data science will eliminate 50% of traditional IT jobs over the next decade. Read more here. The great re-skilling must start now, for companies that want to lead in the data era. Try this.

3) As-a-service is a cyclical change (not secular). The length of the cycle is anyone's guess. And, as with most cycles, it will probably last longer and end faster than most people believe. Much of this cycle is driven by the market forces described above (less money for capex, since all of it is being spent on buybacks/dividends). At some point, companies will realize that 'paying more in perpetuity' is not a good idea, and there will be a Reversion to the Mean.

4) Centralized computing architectures (cloud) will eventually diminish in importance. Right now, we are in a datacenter capital arms race, much like the telcos were in 1999. But, as edge devices (smartphones, IoT, etc.) continue to advance and the world is blanketed with supercomputers, there will be less of a need for a centralized processing center.

5) Machine Learning is the new 'Intel inside'. This will become a default capability in every product/device, instrumenting business processes and decision making. This will put even more pressure on the traditional definition of roles in an organization.

6) There is now general agreement that data is a strategic asset. Because of this, many IT and Cloud providers are seeking to capture data, under the notion that 'data has gravity'. Once it is captured, the belief goes, it is hard to move, and therefore can be monetized. While I understand that in concept, it's not very user-centric. Who likes having their data trapped? No one. Therefore, I believe the real winners in this next cycle will be those that can enable open and decentralized data access. This is effectively the opposite of capturing it. It's enabling a transparent and open architecture, with the ability to analyze and drive insights from anywhere. Yet another reason to believe in Spark.

***

It's debatable if the 6 impacts above represent Second-Level Thinking. While they may to some extent, the real thinking would be to flesh out the implications of each, and place bets on the implications. These are bets that could be made in the form of financial investments, product investments, or "start a new company" investments.

Monday, December 14, 2015

Transforming Customer Relationships with Data

BUYING A HOUSE

A friend walked into a bank in a small town in Connecticut. As frequently portrayed in movies, the benefit of living in a small town is that you see many people that you know around town and often have a first name relationship with local merchants. It’s very personal and something that many equate to the New England charm of a town like New Canaan. As this friend, let us call him Dan, entered the bank, it was the normal greetings by name, discussion of the recent town fair, and a brief reflection on the weekend’s Little League games.

Dan was in the market for a home. Having lived in the town for over ten years, he wanted to upsize a bit, given that his family was now 20 percent larger than when he bought the original home. After a few months of monitoring the real estate listings and working with a local agent (whom he knew from his first home purchase), Dan and his wife settled on the ideal house for their next home. Dan’s trip to the bank was all business, as he needed a mortgage (much smaller than the one on his original home) to finance the purchase of the new home.

The interaction started as you may expect: “Dan, we need you to fill out some paperwork for us and we’ll be able to help you.” Dan proceeded to write down everything that the bank already knew about him: his name, address, Social Security number, date of birth, employment history, previous mortgage experience, income level, and estimated net worth. There was nothing unusual about the questions except for the fact that the bank already knew everything they were asking about.

After he finished the paperwork, it shifted to an interview, and the bank representative began to ask some qualitative questions about Dan’s situation and needs, and the mortgage type that he was looking for. The ever-increasing number of choices varied based on fixed versus variable interest rate, duration and amount of the loan, and other factors.

Approximately 60 minutes later, Dan exited the bank, uncertain of whether or not he would receive the loan. The bank knew Dan. The bank employees knew his wife and children by name, and they had seen all of his deposits and withdrawals over a ten-year period. They’d seen him make all of his mortgage payments on time. Yet the bank refused to acknowledge, through their actions, that they actually knew him.

BRIEF HISTORY OF CUSTOMER SERVICE

There was an era when customer support and service was dictated by what you told the person in front of you, whether that person was a storeowner, lender, or even an automotive dealer. It was then up to that person to make a judgment on your issue and either fix it or explain why it could not be fixed. That simpler time created a higher level of personal touch in the process, but then the telephone came along. The phone led to the emergence of call centers, which led to phone tree technology, which resulted in the decline in customer service.

CUSTOMER SERVICE OVER TIME

While technology has advanced exponentially since the 1800s, customer experience has not advanced as dramatically. While customer interaction has been streamlined and automated in many cases, it is debatable whether or not those cost-focused activities have engendered customer loyalty, which should be the ultimate goal.

The following list identifies the main historical influences on customer service. Each era has seen technological advances and along with that, enhanced interaction with customers.


Pre-1870: In this era, customer interaction was a face-to-face experience. If a customer had an issue, he would go directly to the merchant and explain the situation. While this is not scientific, it seems that overall customer satisfaction was higher in this era than others for the simple fact that people treat a person in front of them with more care and attention than they would a person once or twice removed.

1876: The telephone is invented. While the telephone did not replace the face-to-face era immediately, it laid the groundwork for a revolution that would continue until the next major revolution: the Internet.

1890s: The telephone switchboard was invented. Originally, phones worked only point-to-point, which is why phones were sold in pairs. The invention of the switchboard opened up the ability to communicate one-to-many. This meant that customers could dial a switchboard and then be directly connected to the merchant they purchased from or to their local bank.

1960s: Call centers first emerged in the 1960s, primarily a product of larger companies that saw a need to centralize a function to manage and solve customer inquiries. This was more cost effective than previous approaches, and perhaps more importantly, it enabled a company to train specialists to handle customer calls in a consistent manner. Touch-tone dialing (1963) and 1-800 numbers (1967) fed the productivity and usage of call centers.

1970s: Interactive Voice Response (IVR) technology was introduced into call centers to assist with routing and to offer the promise of better problem resolution. Technology for call routing and phone trees improved into the 1980s, but it is not something that ever engendered a positive experience.

1980s: For the first time, companies began to outsource the call-center function. The belief was that if you could pay someone else to offer this service and it would get done at a lower price, then it was better. While this did not pick up steam until the 1990s, this era marked the first big move to outsourcing, and particularly outsourcing overseas, to lower-cost locations.

1990s to present: This era, marked by the emergence of the Internet, has seen the most dramatic technology innovation, yet it’s debatable whether or not customer experience has improved at a comparable pace. The Internet brought help desks, live chat support, social media support, and the widespread use of customer relationship management (CRM) and call-center software.

Despite all of this progress and developing technology through the years, it still seems like something is missing. Even the personal, face-to-face channel (think about Dan and his local bank) is unable to appropriately service a customer that the employees know (but pretend not to, when it comes to making business decisions).

While we have seen considerable progress in customer support since the 1800s, the lack of data in those times prevented the intimate customer experience that many longed for. It’s educational to explore a couple of pre-data-era examples of customer service, to understand the strengths and limitations of customer service prior to the data era.


BOEING

The United States entered World War I on April 6, 1917. The U.S. Navy quickly became interested in Boeing’s Model C seaplane. The seaplane was the first “all-Boeing” design and utilized either single or dual pontoons for water landing. The seaplane promised agility and flexibility, which were features that the Navy felt would be critical to managing the highly complex environment of a war zone. Since all of the testing of the seaplane took place in Pensacola, Florida, the company was forced to deconstruct the planes and ship them across the United States by rail. Then, in the process, they had to decide whether or not to send an engineer and pilot, along with the spare parts, in order to ensure the customer’s success. This is the pinnacle of customer service: knowing your customers, responding to their needs, and delivering what is required, where it is required. Said another way, the purchase (or prospect of purchase) of the product assumed customer service.

The Boeing Company and the Douglas Aircraft Company, which would later merge, led the country in airplane innovation. As Boeing expanded after the war years, the business grew to include much more than just manufacturing, with the advent of airmail contracts and a commercial flight operation known as United Air Lines. Each of these expansions led to more opportunities, namely around a training school, to provide United Air Lines an endless supply of skilled pilots.

In 1936, Boeing founded its Service Unit. As you might expect, the first head of the unit was an engineer (Wellwood Beall). After all, the mission of the unit was expertise, so a top engineer was the right person for the job. As Boeing expanded overseas, Beall decided he needed to establish a division focused on airplane maintenance and training the Chinese, as China had emerged as a top growth area.

When World War II came along, Boeing quickly dedicated resources to training, spare parts, and maintaining fleets in the conflict. A steady stream of Boeing and Douglas field representatives began flowing to battlefronts on several continents to support their companies’ respective aircraft. Boeing put field representatives on the front lines to ensure that planes were operating and, equally importantly, to share information with the company engineers regarding needed design improvement.

Based on lessons learned from its first seven years in operation, the service unit reorganized in 1943, around four areas:

-Maintenance publications
-Field service
-Training
-Spare parts

To this day, that structure is still substantially intact. Part of Boeing’s secret was a tight relationship between customer service technicians and the design engineers. This ensured that the Boeing product-development team was focused on the things that mattered most to their clients.

Despite the major changes in airplane technology over the years, the customer-support mission of Boeing has not wavered: “To assist the operators of Boeing planes to the greatest possible extent, delivering total satisfaction and lifetime support.” While customer service and the related technology have changed dramatically through the years, the attributes of great customer service remain unchanged. We see many of these attributes in the Boeing example:

1. Publications: Sharing information, in the form of publications available to the customer base, allows customers to “help themselves.”

2. Teamwork: The linkage between customer support and product development is critical to ensuring client satisfaction over a long period of time.

3. Training: Similar to the goal with publications, training makes your clients smarter, and therefore, they are less likely to have issues with the products or services provided.

4. Field service: Be where your clients are, helping them as it’s needed.

5. Spare parts: If applicable, provide extra capabilities or parts needed to achieve the desired experience in the field.

6. Multi-channel: Establishing multiple channels enables the customer to ask for and receive assistance.

7. Service extension: Be prepared to extend service to areas previously unplanned for. In the case of Boeing, this was new geographies (China) and at unanticipated time durations (supporting spare parts for longer than expected).

8. Personalization: Know your customer and their needs, and personalize their interaction and engagement.

Successful customer service entails each of these aspects in some capacity. The varied forms of customer service depend largely on the industry and product, but also the role that data can play.


FINANCIAL SERVICES

There are a multitude of reasons why a financial services firm would want to invest in a call center: lower costs and consolidation, improved customer service, cross-selling, and extended geographical reach.

Financial services have a unique need for call centers and expertise in customer service, given that customer relationships are ultimately what they sell (the money is just a vehicle towards achieving the customer relationship). Six of the most prominent areas of financial services for call centers are:

1) Retail banking: Supporting savings and checking accounts, along with multiple channels (online, branch, ATM, etc.)

2) Retail brokerage: Advising and supporting clients on securities purchases, funds transfer, asset allocation, etc.

3) Credit cards: Managing credit card balances, including disputes, limits, and payments

4) Insurance: Claims underwriting and processing, and related status inquiries

5) Lending: Advising and supporting clients on securities purchases, funds transfer, asset allocation, etc.

6) Consumer lending: A secured or unsecured loan with fixed terms issued by a bank or financing company. This includes mortgages, automobile loans, etc.

Consumer lending is perhaps the most interesting financial services area to explore from the perspective of big data, as it involves more than just responding to customer inquiries. It involves the decision to lend in the first place, which sets off all future interactions with the consumer.

There are many types of lending that fall into the domain of consumer lending, including credit cards, home equity loans, mortgages, and financing for cars, appliances, and boats, among many other possible items, many of which are deemed to have a finite life.

Consumer lending can be secured or unsecured. This is largely determined by the feasibility of securing the loan (it’s easy to secure an auto loan with the auto, but it’s not so easy to secure credit card loans without a tangible asset), as well as the parties’ risk tolerance and specific objectives about the interest rate and the total cost of the loan. Unsecured loans obviously will tend to have higher returns (and risk) for the lender.

Ultimately, from the lender’s perspective, the decision to lend or not to lend will be based on the lender’s belief that she will get paid back, with the appropriate amount of interest.

A consumer-lending operation, and the customer service that would be required to manage the relationships, is extensive. Setting it up requires the consideration of many factors:

Call volumes: Forecasting monthly, weekly, and hourly engagement

Staffing: Calibrating on a monthly, weekly, and hourly basis, likely based on expected call volumes

Performance management: Setting standards for performance with the staff, knowing that many situations will be unique

Location: Deciding on a physical or virtual customer service operation, knowing that this decision impacts culture, cost, and performance

A survey of call center operations from 1997, conducted by Holliday, showed that 64 percent of the responding banks expected increased sales and cross sales, while only 48 percent saw an actual increase. Of the responding banks, 71 percent expected the call center to increase customer retention; however, only 53 percent said that it actually did.

The current approach to utilizing call centers is not working and ironically, has not changed much since 1997.


THE DATA ERA

Data will transform customer service, as data can be the key ingredient in each of the aspects of successful customer service. The lack of data or lack of use of data is preventing the personalization of customer service, which is the reason that it is not meeting expectations.

In the report titled “Navigate the Future Of Customer Service” (Forrester, 2012), Kate Leggett highlights key areas that depend on the successful utilization of big data. These include: auditing the customer service ecosystem (technologies and processes supported across different communication channels); using surveys to better understand the needs of customers; and incorporating feedback loops by measuring the success of customer service interactions against cost and satisfaction goals.


AN AUTOMOBILE MANUFACTURER

Servicing automobiles post-sale requires a complex supply chain of information. In part, this is due to the number of parties involved. For example, a person who has an issue with his car is suddenly dependent on numerous parties to solve the problem: the service department, the dealership, the manufacturer, and the parts supplier (if applicable). That is four relatively independent parties, all trying to solve the problem, and typically pointing to someone else as being the cause of the issue.

This situation can be defined as a data problem. More specifically, the fact that each party had their own view of the problem in their own systems, which were not integrated, contributed to the issue. As any one party went to look for similar issues (i.e. queried the data), they received back only a limited view of the data available.

A logical solution to this problem is to enable the data to be searched across all parties and data silos, and then reinterpreted into a single answer. The challenge with this approach to using data is that it is very much a pull model, meaning that the person searching for an answer has to know what question to ask. If you don’t know the cause of a problem, how can you possibly know what question to ask in order to fix it?

This problem necessitates data to be pushed from the disparate systems, based on the role of the person exploring and based on the class of the problem. Once the data is pushed to the customer service representatives, it transforms their role from question takers to solution providers. They have the data they need to immediately suggest solutions, options, or alternatives. All enabled by data.
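To make the pull-versus-push distinction concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `Record` fields, the role names, and the sample records are hypothetical and do not describe any real manufacturer's systems. The point is only that the selection is keyed on who is looking and what class of problem they face, rather than on a question the person has to think up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    source: str         # party whose system holds the record (hypothetical)
    problem_class: str  # category of issue the record relates to
    roles: frozenset    # roles for whom the record is relevant
    note: str


# Hypothetical records sitting in the four parties' separate systems.
RECORDS = [
    Record("service department", "brakes", frozenset({"service_rep"}),
           "Three similar repairs this quarter"),
    Record("manufacturer", "brakes", frozenset({"service_rep", "dealer"}),
           "Technical bulletin on brake sensor"),
    Record("parts supplier", "brakes", frozenset({"dealer"}),
           "Replacement sensor on back order"),
    Record("dealership", "transmission", frozenset({"service_rep"}),
           "Customer reported rough shifting"),
]


def push(role, problem_class, records=RECORDS):
    """Select records from every source that match the role and problem class."""
    return [
        r for r in records
        if r.problem_class == problem_class and role in r.roles
    ]


# The representative supplies only a role and a problem class, not a query.
for record in push("service_rep", "brakes"):
    print(f"{record.source}: {record.note}")
```

In this sketch the representative never writes a search; the matching records from every silo simply arrive, which is the shift from question taker to solution provider described above.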


ZENDESK

Mikkel Svane spent many years of his life implementing help-desk software. The complaints from that experience were etched in his mind: It’s difficult to use, it’s expensive, it does not integrate easily with other systems, and it’s very hard to install. This frustration led to the founding of Zendesk.

As of December 2013, it is widely believed that Zendesk has over 20,000 enterprise clients. Zendesk was founded in 2007, and just seven short years later, it had a large following. Why? In short, it found a way to leverage data to transform customer service.

Zendesk asserts that bad customer service costs major economies around the world $338 billion annually. Even worse, they indicate that 82 percent of Americans report having stopped doing business with a company because of poor customer service. In the same vein as Boeing in World War II, this means that customer service is no longer an element of customer satisfaction; it is perhaps the sole determinant of customer satisfaction.

A simplistic description of Zendesk would highlight the fact that it is email, tweet, phone, chat, and search data, all integrated in one place and personalized for the customer of the moment. Mechanically, Zendesk is creating and tracking individual customer support tickets for every interaction. The interaction can come in any form (social media, email, phone, etc.) and therefore, any channel can kick off the creation of a support ticket. As the support ticket is created, a priority level is assigned, any related history is collated and attached, and it is routed to a specific customer-support person. But, what about the people who don’t call or tweet, yet still have an issue?

Zendesk has also released a search analytics capability, which is programmed using sophisticated data modeling techniques to look for customer issues, instead of just waiting for the customer to contact the company. A key part of the founding philosophy of Zendesk was the realization that roughly 35 percent of consumers are silent users, who seek their own answers, instead of contacting customer support. On one hand, this is a great advantage for a company, as it reduces their cost of support. But it is fraught with risk of customer satisfaction issues, as a customer may decide to move to a competitor without the incumbent ever knowing they needed help.

Svane, like the executives at Boeing in the World War II era, sees customer service as a means to build relationships with customers, as opposed to a hindrance. He believes this perspective is starting to catch on more broadly. “What has happened over the last five or six years is that the notion of customer service has changed from just being this call center to something where you can create real, meaningful long-term relationships with your customers and think about it as a revenue center.”
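The ticket mechanics described earlier in this section (any channel opens a ticket, a priority is assigned, related history is attached, and the ticket is routed to a person) can be sketched in a few lines of Python. This is not Zendesk's API or its actual logic; the `Ticket` fields, the keyword-based priority rule, the agent names, and the round-robin routing are all assumptions made up for illustration.

```python
from dataclasses import dataclass

# Hypothetical rules; a real system would use far richer signals.
URGENT_WORDS = ("outage", "down", "refund")
AGENTS = ["agent_a", "agent_b"]


@dataclass
class Ticket:
    id: int
    customer: str
    channel: str    # e.g. "email", "phone", "social"
    message: str
    priority: str
    history: list   # earlier tickets from the same customer
    assignee: str


def open_ticket(customer, channel, message, ticket_log):
    """Create a ticket for any inbound interaction and record it in ticket_log."""
    # 1. Assign a priority level (assumed keyword rule).
    text = message.lower()
    priority = "high" if any(word in text for word in URGENT_WORDS) else "normal"
    # 2. Collate related history for this customer.
    history = [t for t in ticket_log if t.customer == customer]
    # 3. Route to a specific support person (assumed round-robin).
    ticket_id = len(ticket_log) + 1
    assignee = AGENTS[(ticket_id - 1) % len(AGENTS)]
    ticket = Ticket(ticket_id, customer, channel, message, priority, history, assignee)
    ticket_log.append(ticket)
    return ticket


log = []
first = open_ticket("dan", "email", "My online account is down", log)
second = open_ticket("dan", "phone", "Question about mortgage rates", log)
print(first.priority, first.assignee)
print(second.priority, second.assignee, len(second.history))
```

The design choice worth noticing is that the channel is just a field on the ticket: an email, a phone call, and a tweet all pass through the same function, which is what lets the history follow the customer rather than the channel.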

BUYING A HOUSE (CONTINUED)

It would be very easy for Dan to receive a loan and for the bank to underwrite that loan if the right data was available to make the decision. With the right data, the bank would know who he is, as well as his entire history with the bank, recent significant life changes, credit behavior, and many other factors. This data would be pushed to the bank representative as Dan walked in the door. When the representative asked, “How can I help you today?” and learned that Dan was in the market for a new home, the representative would simply say, “Let me show you what options are available to you.” Dan could make a spot decision or choose to think about it, but either way, it would be as simple as purchasing groceries. That is the power of data, transforming customer service.

This post is adapted from the book, Big Data Revolution: What farmers, doctors, and insurance agents teach us about discovering big data patterns, Wiley, 2015. Find more on the web at http://www.bigdatarevolutionbook.com

Monday, December 7, 2015

A Universal Translator for Data



Anybody who travels to a foreign country or reads a book or newspaper written in a language they don’t speak understands the value of a good translation. Yet, in the realm of Big Data, application developers face huge challenges when combining information from different sources and when deploying data-heavy applications to different types of computers. What they need is a good translator.

That’s why IBM has donated to the open source community SystemML, which is a universal translator for Big Data and the machine learning algorithms that are becoming essential to processing it. SystemML enables developers who don’t have expertise in machine learning to embed it in their applications once and use it in industry-specific scenarios on a wide variety of computing platforms, from mainframes to smartphones.

Today, we’re announcing that Apache, one of the leading open source organizations in the world, has accepted SystemML as an official Apache Incubator project—giving it the name Apache SystemML.

We open sourced SystemML in June when we threw our weight behind the Apache Spark project—which enables developers and data scientists to more easily integrate Big Data analytics into applications.

We believe that Apache Spark is the most important new open source project in a decade. We’re embedding Spark into our Analytics and Commerce platforms, offering Spark as a service on IBM Cloud, and putting more than 3,500 IBM researchers and developers to work on Spark-related projects.

Apache SystemML is an essential element of the Spark ecosystem of technologies. Think of Spark as the analytics operating system for any application that taps into huge volumes of streaming data. MLlib, the machine learning library for Spark, provides developers with a rich set of machine learning algorithms. And SystemML enables developers to translate those algorithms so they can easily digest different kinds of data and run on different kinds of computers.

SystemML allows a developer to write a single machine learning algorithm and automatically scale it up using Spark or Hadoop, another popular open source data analytics tool, saving significant time on behalf of highly skilled developers. While other tech companies have open sourced machine learning technologies as well, most of those are specialized tools to train neural networks. They are important, but niche, and the ability to ease the use of machine learning within Spark or Hadoop will be critical for machine learning to really become ubiquitous in the long run.

In the coming years, all businesses and, indeed, society in general, will come to rely on computing systems that learn—what we call cognitive systems. This kind of computer learning is critical because the flood of Big Data makes it impossible for organizations to manually train and program computers to handle complex situations and problems—especially as they morph over time. Computing systems must learn from their interactions with data.

The Apache SystemML project has achieved a number of early milestones to date, including:

–Over 320 patches including APIs, Data Ingestion, Optimizations, Language and Runtime Operators, Additional Algorithms, Testing, and Documentation.

–90+ contributions to the Apache Spark project from more than 25 engineers at the IBM Spark Technology Center in San Francisco to make Machine Learning accessible to the fastest growing community of data science professionals and to various other components of Apache Spark.

–More than 15 contributors from a number of organizations to enhance the capabilities of the core SystemML engine.

One of the Apache SystemML committers, D.B.Tsai, had this to say about it: “SystemML not only scales for big data analytics with high performance optimizer technology, but also empowers users to write customized machine learning algorithms using simple domain specific language without learning complicated distributed programming. It is a great extensible complement framework of Spark MLlib. I’m looking forward to seeing this become part of Apache Spark ecosystem.”

We are excited too. We believe that open source software will be an essential element of big data analytics and cognitive computing, just as it has been critical to the advances that have come in the Internet and cloud computing. The more tech companies and developers share resources and combine our efforts, the faster information technology will transform business and society.

Original blog post here.

Wednesday, November 4, 2015

The Calm Before the Storm


Have you ever spent an afternoon in the backyard, maybe grilling or enjoying a game of croquet, when suddenly you notice that everything goes quiet? The air seems still and calm -- even the birds stop singing and quickly return to their nests.

After a few minutes, you feel a change in the air, and suddenly a line of clouds ominously appears on the horizon -- clouds with a look that tells you they aren't fooling around. You quickly dash in the house and narrowly miss the first fat raindrops that fall right before the downpour. At this moment, you might stop and ask yourself, "Why was it so calm and peaceful right before the storm hit?"
-How Stuff Works

The last 5 years, with the onset of Hadoop, Cloud, and Mobile, were merely the calm before the storm. There is a new modern technology stack, a data revolution, and the onslaught of machine learning that will shape the storm to come over the next decade.


***

The mobile supply chain has wreaked havoc on the traditional technology stack. With the advent of high volume chips, screens, storage, etc., it has become cost effective to move away from a vertically integrated architecture to one that is much more flexible and dynamic. We have evolved to a 6 layer Next Generation Technology Stack:


Layer 1: There are 2 aspects to layer 1: a) the repositories and b) the data itself. The repositories include the new breed of flexible and fluid data layers, ranging from Hadoop to Cassandra to other NoSQL data stores. They are very flexible, adaptable, and tuned to modern internet and mobile applications. This also includes databases, data warehouses, and mainframes; said another way, anything that stores data of strategic and operational relevance. Within the repositories, the data itself creates a competitive moat, and offers strategic advantage when used appropriately.

Layer 2: A highly performant processing layer, which enables access to all data in a unified way, and easily incorporates machine learning and produces real-time insights. This is why I have called Spark the Analytics Operating System.

Layer 3: Machine learning, on a corpus of strategically relevant data, is the new competitive moat for an enterprise. This layer automates the application of analytics and delivers real-time insights for business impact. It's the holy grail that has never quite been found in most organizations.

Layer 4: A unified application layer, which provides seamless access to analytical models, data, and insights. This is the glue that enables most business users to leverage and understand data-rich applications.

Layer 5: The easiest way to democratize access to data in an organization is to give users something elegant and insightful. Vertical and horizontal applications, built for a specific purpose, serve this role in an organization.

Layer 6: The number of people connected to the Internet has surged from 400 million in 1999 to 3 billion last year. The number of connected devices is estimated at 50 billion by 2020. These are all access points for the Next Generation Technology Stack.



***

In Big Data Revolution, I dissected 3 Business Models for the Data Era. In summary, there are 3 dominant business models that I see emerging:

Data as a competitive advantage: While this is somewhat incremental in its approach, it is evident that data can be utilized and applied to create a competitive advantage in a business. For example, an investment bank, with all the traditional processes of a bank, can gain significant advantage by applying advanced analytics and data science to problems like risk management. While it may not change the core functions or processes of the bank, it enables the bank to perform them better, thereby creating a market advantage.

Data as improvement to existing products or services: This class of business model plugs data into existing offerings, effectively differentiating them in the market. It may not provide competitive advantage (but it could), although it certainly differentiates the capabilities of the company. A simple example could be a real estate firm that utilizes local data to better target potential customers and match them to vacancies. This is a step beyond the data that would come from the Multiple Listing Service (MLS). Hence, it improves the services that the real estate firm can provide.

Data as the product: This class of business is a step beyond utilizing data for competitive advantage or plugging data into existing products. In this case, data is the asset or product to be monetized. An example of this would be Dun & Bradstreet, which has been known as the definitive source of business-related data for years.

Since my work on business models was published, my thinking has evolved a bit. While I think each of those business models is still valid, I am less certain that any of them on their own will create a distinctive competitive advantage. Instead, I believe that the value is where the software meets the data, and access is democratized. Said another way, it's hard to create value by only looking at one layer of the Next Generation Technology Stack.

Enter The Weather Company...

***

Last week, we announced our intention to acquire The Weather Company. The media reaction has ranged from, "IBM is buying a TV station?" (we are not), to "IBM is buying the clouds", to "IBM is entering the data business." Some of the reactions are wrong, others are humorous, and some are overly simplistic. The reality is that IBM has just made a significant move in defining and leading in the Next Generation Technology Stack. This interview captures it well. Let's look at this in terms of each layer:

Layer 1: IBM has long been a leader in Layer 1, around all types of repositories. From Netezza to DB2 to the mainframe to Informix to Cloudant to BigInsights to enterprise content, most of the world's valuable enterprise data is stored in IBM technology. With The Weather Company, we now have a rich set of data assets. The Weather Company can decompose what is happening on earth into over 3 billion elements. And it's not just weather data. In an increasingly mobile world, location matters.

Layer 2: IBM is the enterprise leader in Spark. Through a variety of partnerships like Databricks and Typesafe, we are a key part of this blossoming community.

Layer 3: Through our open source contributions to Machine Learning, our rich portfolio of Analytical models, and the world's greatest Cognitive system (Watson), IBM can provide applications (Layer 5) and insights that are unmatched. Just think how powerful Watson becomes when it understands location and environment, as well as everything else it already knows.

Layer 4: The Weather Company has an internet-scale high volume platform for IoT. It can seamlessly be extended for other sources of data and “can ingest data at a very high volume in fractions of a second that will be an engine that feeds Watson”.

Layer 5: IBM has a rich set of industry applications and solutions across Commerce, Watson, and countless other areas. The Weather Company applications and websites handle 65 billion unique accesses on weather and related data per day. This is scale that is unmatched.

Layer 6: The Weather Company mobile application has 66 million unique visitors a month and connective tissue to tap into the 50 billion connected devices that are emerging.

In summary, this is much more than weather data. Overnight, IBM has become the leader in the Next Generation Technology Stack. It is the basis for extension into financial services, automotive, telematics, healthcare, and every other industry being transformed by data.


***

It's always calm before the storm hits. Sometimes, in the moment, you don't even recognize the calm for what it is. My guess is that most people have not considered the last 5 years the calm before the storm. But it was.

Thursday, October 22, 2015

Preparing your career for the data science revolution

I became familiar with Geoffrey Moore in 2002, while consulting at Symbol Technologies, helping it transform from a product-only company into a solutions company (that is, both products and services). The president of Symbol mentioned Moore’s work, and he often used the terms “core” and “context” when describing Symbol’s business operations. As I began reading Moore, his ideas crystallized in my mind, developing in me a growing appreciation for his work.

Innovating in a complex business environment

In 2005, Moore published Dealing with Darwin, offering a glimpse into how companies innovate during each phase of their evolution. Moore paints the picture of a business environment that is ever more competitive, globalized, deregulated and commoditized. Unsurprisingly, the combination of these forces puts immense pressure on companies to find ways of innovating in an increasingly complex environment.

Back in 2005, not all those forces were pronounced, but they have since become indisputably so. Indeed, not only are they present, but they are beginning to make many large companies irrelevant. The unbundling chart in Figure 1 demonstrates:

Source: https://www.cbinsights.com/blog/disrupting-banking-fintech-startups/

Competition and innovation have been forever changed. But such forces’ effects on companies merely marked a beginning. The next assault will be on individuals within companies. In the next five years, every employee will be dealing with Darwin personally.

Individuals first dealt with Darwin on substantial levels during the Industrial Revolution, continuing through the rest of the 1900s, when factories began replacing traditional small industries. As factories appeared, demand for labor heightened. But factory work was quite different from traditional work and quickly became known for its poor working conditions. Even worse, because most factory work did not require a particular strength or skill, workers were considered unskilled and were thus easily replaceable. At first, factory workers were replaced by other workers, but eventually their jobs were automated away or moved to lower-cost countries.

Shifting roles in the workforce

Economists typically categorize three kinds of work in a country: agriculture (farming), industry (manufacturing) and services. Each type of work plays a role in the economy, but macroeconomic forces have changed the mix over time. As seen in Figure 2, although 70 percent of the labor force worked in agriculture in 1840, the agricultural share reduced to 40 percent by 1900 and today sits at a mere 2 percent. Such shifts force employees to adapt, dealing with Darwin at a very personal level.

Source: https://www.minnpost.com/macro-micro-minnesota/2012/02/history-lessons-understanding-decline-manufacturing

As Figure 3 shows, this shift has accelerated in the past 15 years, a period during which productivity has exploded:


Clearly, Darwin has struck at traditional workers and approaches. Thanks to widespread automation, broad acceptance of best practices and experience, skills that were once valued—and, indeed, that were necessary for industry—have been reduced to commodity services. Could the same thing happen to traditional IT jobs?

Redefining the skilled worker

In Big Data Revolution: What farmers, doctors, and insurance agents teach us about discovering big data patterns, I identify 54 big data patterns across a variety of industries. Whether you are a retailer, an insurance agent, a commercial banker or a doctor, these patterns will affect your role and the industry in which you play it. One pattern I identify is that of redefining the skilled worker:

The data era is demanding a new definition of skilled workers. While this may require skills like statistics or math, that is merely one aspect of the skill gap that must be filled. In medicine, it’s about redefining medical school to include skills like data analysis and data collection. In farming, the new skill set involves understanding how to utilize multiple sources of data (from drones, GPS, or otherwise) and apply that insight to deliver better yields and productivity.

The definition of a skilled worker has changed dramatically, based on different eras. In the 1700s and 1800s, skilled workers were defined by either their physical abilities or their knowledge of a certain craft (think bookkeeper). At the time of the Industrial Revolution, skilled workers were defined by their physical capacity to operate a machine or work on an assembly line. In the last 20 years, skilled workers have evolved further, with a premium placed on customer service and technology skills.

A skilled worker in the next decade will be defined by her ability to acquire, analyze, and utilize data and information. This new skilled worker will emerge in every industry, with a slightly different definition of the skill set for each industry.

The key patterns to redefine a skilled worker are:

-Understanding the skill sets needed today, tomorrow, and further in the future, based on the potential for data disruption.
-Redefining roles and skill sets to take advantage of the new data available that can impact business processes.
-Training and retraining current and new workers is a distinguishing capability to remain relevant.

An organization must accept that the skills they have today may not be sufficient for the data era. The challenge is that the skill gaps, once understood, will likely prove to be excessively broad. The data era demands skills in data science, statistics, and probability. And that’s just on the business side. On the IT side, the skill needs are much different than traditional IT, with a premium placed on programming and modeling skills. Leading organizations will document their skills today versus the skills needed tomorrow and systematically begin to fill the inevitable gaps.

Though companies must understand their skill gaps and solve them at a company level, the challenge for individuals is even more pressing: What skill will you master? What skill aligns with the data era? For what craft will you be known? I believe that data science is the answer to all three questions.

Taking the cloud into account

The cloud has profoundly affected IT. Although moving to the cloud can help diminish the costs of starting a company and can help cut capital expenses, I believe that the cloud’s bigger effect will be on the traditional definition of a skilled IT worker. As organizations move to the cloud, I expect the importance of traditional IT roles and skills—systems administrator, architect, DBA, IT operations—to diminish or even be eliminated over the next 5 to 10 years. Make no mistake: I think this shift will take a long time to play out. But such a belief means that now is the time to prepare for the coming revolution.

Let’s imagine an organization having revenue of $1 billion, and let’s assume that it spends 6 percent of its revenue per year on IT. Figure 4 shows some related factors that we can calculate under such an assumption:
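The starting point of that calculation is simple arithmetic and can be sketched in Python. The revenue and the 6 percent figure come from the text; the 30 percent share of spend moving to the cloud is a purely hypothetical input chosen for illustration and is not a figure from the original analysis.

```python
def it_budget(revenue, it_percent):
    """Annual IT spend implied by revenue and the percent of revenue spent on IT."""
    return revenue * it_percent // 100


def spend_moved_to_cloud(budget, cloud_percent):
    """Portion of the IT budget shifted to the cloud (cloud_percent is an assumption)."""
    return budget * cloud_percent // 100


REVENUE = 1_000_000_000   # $1 billion, from the text
IT_PERCENT = 6            # 6 percent of revenue, from the text
CLOUD_PERCENT = 30        # hypothetical value, not from the source

budget = it_budget(REVENUE, IT_PERCENT)
shifted = spend_moved_to_cloud(budget, CLOUD_PERCENT)

print(f"Annual IT budget: ${budget:,}")
print(f"Spend shifted to cloud at {CLOUD_PERCENT}%: ${shifted:,}")
```

Under these inputs the organization spends $60 million a year on IT. Changing `CLOUD_PERCENT` shows how quickly the portion of that budget tied to traditional IT work shrinks as more of it moves to the cloud.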


For such a company, the decision to shift to the cloud will be predicated on increasing leverage and efficiency and will force a rethinking of the workforce. There is good news (salaries of the employees will go up as they acquire rare and sophisticated skills), but as Figure 5 depicts, there is also bad news: traditional IT skills will be less in demand, because much of that work will have shifted to the cloud.


This shift presents both pitfall and opportunity for individuals. Indeed, the greatest risk lies in doing nothing—in merely continuing business as usual. More tools and education are available than ever have been for individuals who decide that they want to be leaders in the data era. To be blunt: Data science will be the data era’s defining skill. Though traditional IT skills will remain important for some, such skills will be increasingly less relevant in the cloud-centric data era.

Monday, September 14, 2015

IBM | Spark - An Update


IBM made some significant announcements around our investment in Spark back in June (See here and here). 90 days later, I saw fit to provide an update on where we stand in our community efforts.

***

Before I get to the details, I want to first re-state why I believe Spark will be a critical force in technology, on the same scale as Linux.

1) Spark is the Analytics operating system: Any company interested in analytics will be using Spark.
2) Spark unifies data sources. Hadoop is one of many repositories that Spark may tap into.
3) The unified programming model of Spark makes it the best choice for developers building data-rich analytic applications.
4) The real value of Spark is realized through Machine Learning. Machine Learning automates analytics and is the next great scale effect for organizations.

I was recently interviewed by Datanami on the topic of Spark. You can read it here. I like this article because it presents an industry perspective on Hadoop and Spark, but it's also clear on our unique point of view.

Also, this slideshare illustrates the point of view:


***

So, what have we accomplished and where are we going?

1) We have been hiring like crazy. As fast as we can. The STC in San Francisco is filling up and we have started to bring on community leaders like @cfregly.

2) Client traction and response has been phenomenal. We have references already and more on the way.

3) We have open sourced SystemML as promised (see on github) and we are working on it with the community...in the open. This contribution is over 100,000 lines of code.

4) Spark 1.5 was just released. We went from 3 contributors to 11, in one release cycle. Read more here.

5) Our Spark specific JIRAs have been ~5,000 lines of code. You can watch them in action here.

6) We are working closely with partners like Databricks and Typesafe.

7) We have trained ~300,000 data scientists through a number of forums, including BigDataUniversity.com. You can also find some deep technical content here.

8) We have seen huge adoption of the Spark Service on Bluemix.

9) We have ~10 IBM products that are leveraging Spark and many more in the pipeline.

10) We launched a Spark newsletter. Anyone can subscribe here.

***

Between now and the end of the year, we have 5 significant events, where we will have much more to share regarding Spark:

a) Strata NY- Sept 29-Oct 1 in New York, NY.
b) Apache Big Data Conference- Sept 28-30 in Budapest, Hungary.
c) Spark Summit Europe- Oct 27-29 in Amsterdam.
d) Insight- Oct 26-29 in Las Vegas.
e) Datapalooza- November 10-12 in San Francisco.


In closing, here is a peek inside: