Suggesting a 'pilot' is a weakness

This was my observation on Twitter/LinkedIn a couple weeks ago:

Nearly everything I share in such forums is based on actual events and observations, not theory. I didn’t expect any reaction. 82 comments later, I had clearly struck a nerve. The interesting thing is the dichotomy of the reactions. A portion of people think I don’t understand design thinking, MVPs, and experimentation. Another portion vehemently agrees with my statement. Given the passionate debate, I felt it appropriate to clarify my thinking. The background for this starts with a couple of core beliefs I hold.

Belief #1: The job of a business executive is to maximize return on invested capital.

Many companies are myopically focused on growth. Growth is wonderful, but in isolation it ignores the fact that the true value of a business is determined by the discounted value of its future cash flows. The future cash flows ultimately determine what can be distributed to shareholders, and they can be maximized by growth, but also by optimizing profit margins and capital efficiency.

Read the rest on Medium here.

What I Read

I was speaking to a group earlier this week and someone asked me about my reading habits. I get this question somewhat frequently, and it's not an easy one to answer. My belief is that you have to start with things that you have a passionate interest in, and then see where that takes you. I have tried to force myself to read certain things/topics through the years, and it always fails, due to lack of sincere interest. This being said, I have developed a few consistent reading habits over the years, and here is a summary.

I find a lot of my reading on Twitter. Here is who I follow. But again, follow people and things that interest you, to maximize your reading and learning. I use Pocket to save reading for later.

Wall Street Journal
Financial Times

The High Tech Strategist

GMO Letter
Greenlight Capital Letter

Berkshire Hathaway Annual Report
JPMorgan Chase Annual Report
Amazon Annual Report

People I Read About (or watch any speech they give)
Jim Chanos
Warren Buffett
Jamie Dimon
Howard Marks
Bob Iger
John Legere
Nick Saban
Urban Meyer
Jim Rogers
Ray Dalio
Clay Christensen
Nelson Peltz
Malcolm Gladwell
Bill Gurley
Mark Cuban
Marc Andreessen
Michael Mauboussin

Companies I Read About
3G Capital
Berkshire Hathaway
JPMorgan Chase

I try to read 1 book per week. I typically don't succeed. Here are my recently read books:

The Spirit of St. Louis (reading now)
Shoe Dog
Hellhound on His Trail
Monetizing Innovation
Too Soon Old, Too Late Smart
Extreme Ownership
American Icon
Double Your Profits in 6 Months or Less
Sam Walton: Made in America
Ben Franklin
A Short History of Nearly Everything
This Time It's Different
Medici Effect
The Hard Thing About Hard Things
MONEY Master the Game
The New Kingmakers
Franklin Barbecue
The Outsiders
Dream Big
10% Happier
The Everything Store
Nearing Home
The Year Without Pants
The Tao of Chip Kelly
The Leaders Code
The Energy Bus
Currency Wars
How Will You Measure Your Life
Dethroning the King
Fooling Some People All of the Time

Everything Starts With Design — You Should, Too

“There is no such thing as a commodity product, just commodity thinking.” — David Townsend

Anthony DiNatale was born in South Boston. He entered the flooring business with his father in 1921 and began a career of craftsmanship and woodworking. In 1933, he founded DiNatale Flooring in Charlestown, working job to job, primarily in the northeast United States. In 1946, Walter Brown approached DiNatale and asked him to build a floor for a new basketball team to use. DiNatale quoted him $11,000 to complete the project, and the deal was struck.

DiNatale quickly went to work, knowing that he had to be cost-conscious to complete the construction, since he had bid aggressively to win the project. He gathered wood from a World War II army barracks and started building. He quickly noticed a problem: the wood scraps were too short for him to take his traditional approach to building a floor. So he began to create an alternating pattern, changing the direction of the wood pieces to fasten them together. He kept creating 5-foot panels, and when he had 247 of them, his work was completed.

Walter Brown was the owner of the Boston Celtics. When the Celtics moved into the Boston Garden in 1952, the floor commissioned by Brown in the year of their founding went with them. The floor was connected by 988 bolts and served as the playing surface for 16 NBA championships between 1957 and 1986.

DiNatale was a craftsman, an artist, a woodworker, but most prominently a designer. He made use of what he had and designed what would become the most iconic playing surface in professional sports. The floor became a home-court advantage for the Celtics, as competitors complained about its dead spots and intricacies.

Design is enduring. Design is timeless. And, every once in a while, design becomes a major advantage.

Read the rest here.


"Don't confuse my kindness for weakness." - Joe Schutzman

I was a 26-year-old consultant on my first real project in 2000, working for a client in Fort Lauderdale, Florida. The project was a mess. We had very little subject matter expertise, even less leadership, and it was going the wrong direction. Joe Schutzman was one of the people who came to the rescue. He brought a sense of calm and composure, and the ability to simplify a complex situation. And he taught me the difference between being principled and being stubborn.

I've known a number of stubborn people in my life, probably most notably my grandfather. It doesn't make them bad people, it just makes them, well, stubborn. Sometimes principled people are mistaken for stubborn, because they are so tied to their belief system. But there is a big difference.

Joe was flexible on details, but determined on direction. This applied in business, as well as in his life. This is the essence of a principled life. He was the champion of the pirate spirit in our team at work. I believe it originally came from the quote, "I'd rather be a pirate than join the Navy." Regardless of where it originated, it embodied his spirit of never accepting the status quo. He was the consummate transformation agent, which is not necessarily common in someone so principled. There is often beauty in contradictions.

Joe passed away this morning, with his family in Kansas City. He will be missed by all who knew him, as a colleague, friend, father, husband, and sibling. But his impact will live on, setting the bar for all of us: live a principled life, yet never be afraid to disrupt the status quo.

The End of Tech Companies

“If you aren’t genuinely pained by the risk involved in your strategic choices, it’s not much of a strategy.” — Reed Hastings

Enterprise software companies are facing unprecedented market pressure. With the emergence of cloud, digital, machine learning, and analytics (to name a few), the traditional business models, cash flows, and unit economics are under pressure. The results can be seen in some public stock prices (HDP, TDC, IMPV, etc.), and nearly everyone’s financials (flat to declining revenues in traditional spaces).

The results can also be seen in the number of private transactions occurring (Informatica, Qlik, etc.); it’s easier to change your business model outside of the public eye. In short, business models reliant on traditional distribution models, large dollar transactions, and human-intensive operations will remain under pressure.

Many ‘non-tech companies’ tell me, “thank goodness that is not the business we are in” or “technology changes too fast, I’m glad we are in a more traditional space”. These are false hopes. This fundamental shift is coming (or has already come) to every business and every industry, in every part of the world. It does not matter if you are a retailer, a manufacturer, a healthcare provider, an agricultural producer, or a pharma company. Your traditional distribution model, operational mechanics, and method of value creation will change in the next 5 years; you will either lead or be left behind.

It’s been said that we sit on the cusp of the next Industrial Revolution. Data, IoT, and software are replacing industrialization as the driving force of productivity and change. Look no further than the public markets: the 5 largest companies in the world by value are all technology companies.

As Benedict Evans observed, “It is easier for software to enter other industries than for other industries to hire software people.” In the same vein, Naval Ravikant commented, “Competing without software is like competing without electricity.” The rise of the Data era, coupled with software and connected device sprawl, creates an opportunity for some companies to outperform others. Those who figure out how to apply this advantage will drive unprecedented wealth creation and comprise the new S&P 500.

This is the end of ‘tech companies’; there are only companies, steeped in technology, that will survive.

Read the rest on Medium here.

The 4th Dimension of Enterprise Software

Charles and Miranda first met in art school in 1979. Over time, they realized a shared passion for handwork and the elegance of handmade objects for the home. Today, Charles Shackleton Furniture and Miranda Thomas Pottery, the workshops that comprise ShackletonThomas, consist of a group of individuals who share their philosophy.

Charles and Miranda think about 4 elements when creating an object:

1) Design: the shape, decoration, functionality, and style.

2) Materials: they select the best and most beautiful materials for the design.

3) Craftsmanship: the precision, finesse, and functionality of how an object is put together.

4) The Fourth Dimension: “this is the element of design caused when the object is made by human hand, or a tool directly controlled by human hand. All are imperfect, like the human that created it. But, the imperfections are beautiful.”

The fourth dimension is the crucial and final aspect that makes a piece of art truly great. “This is what gives life and soul to the inanimate object.”


Every incumbent player in the enterprise software market is facing a 4th dimension challenge. The first 3 dimensions are nearly the same for everyone: how they invest their R&D/SG&A across serving users, their existing clients/products, and a platform for the future.

Read the rest here.

A Practical Guide to Machine Learning: Understand, Differentiate, and Apply

Co-authored by Jean-Francois Puget (@JFPuget)

Machine Learning represents the new frontier in analytics, and is the answer to how companies can capitalize on the data opportunity. Machine Learning was first defined by Arthur Samuel in 1959 as a “field of study that gives computers the ability to learn without being explicitly programmed.” Said another way, it is the automation of analytics, so that analytics can be applied at scale. What is highly manual today (think of an analyst combing thousand-line spreadsheets) becomes automatic tomorrow (an easy button) through technology. If Machine Learning was first defined in 1959, why is now the time to seize the opportunity? It’s the economics.

A relative graphic to explain:

From the time Machine Learning was defined through the last decade, its application was limited by the cost of compute and data acquisition/preparation. In fact, compute and data consumed the entirety of any analytics budget, leaving zero investment for the real value driver: algorithms to drive actionable insights. In the last couple of years, with the cost of compute and data plummeting, machine learning has become available to anyone, for rapid application and exploitation.


It is well known that businesses must constantly adapt to changing conditions: competitors introduce new offerings, consumer habits evolve, and the economic and political environment changes. This is not new, but the velocity at which business conditions change is accelerating. This constantly accelerating pace of change places a new burden on the technology solutions developed for a business.

Over the years, application developers moved from V-shaped projects, with multi-year turnarounds, to agile development methodologies (turnaround in months, weeks, and often days). This has enabled businesses to adapt their applications and services much more rapidly. For example:

a) A sales forecasting system for a retailer: The forecast must take into account today's market trends, not just those from last month. And, for real-time personalization, it must account for what happened as recently as 1 hour ago.

b) A product recommendation system for a stock broker: it must leverage current interests, trends, and movements, not just last month's.

c) A personalized healthcare system: Offerings must be tailored to an individual and their unique circumstance. Healthcare devices, connected via The Internet of Things (IoT), can be used to collect data on human and machine behavior and interaction.

These scenarios, and others like them, create a unique opportunity for machine learning. Indeed, machine learning was designed to address the fluid nature of these problems.

First, it moves application development from programming to training: instead of writing new code, the application developer trains the same application with new data. This is a fundamental shift in application development, because new, updated applications can be obtained automatically on a weekly, if not daily, basis. This shift is at the core of the cognitive era in IT.

Second, machine learning enables the automated production of actionable insights where the data is (i.e. where business value is greatest). It is possible to build machine learning systems that learn from each user interaction, or from new data collected by an IoT device. These systems then produce output that takes into account the latest available data. This would not be possible with traditional IT development, even if agile methodologies were used.
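To make the shift from programming to training concrete, here is a minimal sketch, with a deliberately simple, hypothetical popularity-based recommender standing in for a real application. The point is that the same code, trained on new data, yields an updated application with no rewrite:

```python
# A toy "application" whose behavior comes from training, not new code.
# The recommender logic here is hypothetical and deliberately simple.

def train(purchase_log):
    """Return a recommender trained on (user, item) purchase pairs."""
    popularity = {}
    for _, item in purchase_log:
        popularity[item] = popularity.get(item, 0) + 1
    ranked = sorted(popularity, key=popularity.get, reverse=True)
    return lambda: ranked[0]  # recommend the most popular item

# Week 1: train on this week's data.
recommend = train([("u1", "books"), ("u2", "books"), ("u3", "games")])
print(recommend())  # 'books'

# Week 2: same code, new data -> an updated application, no rewrite.
recommend = train([("u1", "games"), ("u2", "games"), ("u3", "books")])
print(recommend())  # 'games'
```

The developer's job becomes supplying fresher data, so the "update cycle" is bounded by data collection, not by a release schedule.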


While most companies get to the point of understanding machine learning, too few turn that understanding into action. They are either slowed down by concerns over their data assets, or they attempt machine learning once, then curtail their efforts, claiming the results were not interesting. These are common concerns and considerations, but they are easily surmounted with the right approach.

First, let’s take data. A common trap is to believe that data is all that is needed for a successful machine learning project. Data is essential, but machine learning requires more than data. Machine learning projects that start with a large amount of data, but lack a clear business goal or outcome, are likely to fail. Projects that start with little or no data, yet have a clear and measurable business goal, are more likely to succeed. The business goal should dictate the collection of relevant data and also guide the development of machine learning models. This approach provides a mechanism for assessing the effectiveness of machine learning models.

The second trap in machine learning projects is to view them as one-time events. Machine learning, by definition, is a continuous process, and projects must be run with that in mind.

Machine learning projects are often run as follows:

1) They start with data and a new business goal.

2) Data is prepared, because it wasn’t collected with the new business goal in mind.

3) Once prepared, machine learning algorithms are run on the data in order to produce a model.

4) The model is then evaluated on new, unseen data to see whether it captured something sensible. If it did, it is deployed in a production environment, where it is used to make predictions on new data.
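As a rough sketch of these steps, with a hypothetical threshold rule standing in for a real algorithm (the data, fields, and 2-standard-deviation cutoff are all illustrative assumptions):

```python
import statistics

# 1) Start with data and a business goal: flag anomalous transaction amounts.
raw = [("txn1", "12.50"), ("txn2", "980.00"), ("txn3", "14.20"), ("txn4", "11.90")]

# 2) Prepare the data: it wasn't collected with this goal in mind, so parse it.
amounts = [float(a) for _, a in raw]

# 3) Run an algorithm on the data to produce a model; here, a simple
#    mean-plus-two-standard-deviations threshold.
mean = statistics.mean(amounts)
threshold = mean + 2 * statistics.stdev(amounts)
model = lambda amount: amount > threshold  # True = anomaly

# 4) Evaluate on new, unseen data, then deploy if it captures something sensible.
print(model(15.0))    # a typical amount
print(model(5000.0))  # a likely anomaly
```

A real project would use a proper learning algorithm and a held-out evaluation set, but the shape of the workflow is the same.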

While this typical approach is valuable, it is limited by the fact that the models learn only once. You may have developed a great model, but changing business conditions can make it irrelevant. For instance, assume machine learning is used to detect anomalies in credit card transactions. The model is created using years of past transactions, where the anomalies are fraudulent transactions. With a good data science team and the right algorithms, it is possible to obtain a fairly accurate model. This model can then be deployed in a payment system, where it flags anomalies when it detects them. Transactions with anomalies are then rejected. This is effective in the short term, but clever criminals will soon recognize that their scam is being detected. They will adapt, and they will find new ways to use stolen credit card information. The model will not detect these new ways, because they were not present in the data used to produce it. As a result, the model's effectiveness will drop.

The cure for this performance degradation is to monitor the effectiveness of model predictions by comparing them with actuals. For instance, after some delay, a bank will know which transactions were fraudulent. It is then possible to compare the actual fraudulent transactions with the anomalies detected by the machine learning model, and from this comparison compute the accuracy of the predictions. One can then monitor this accuracy over time and watch for drops. When a drop happens, it is time to refresh the machine learning model with more up-to-date data. This is what we call a feedback loop. See here:

With a feedback loop, the system learns continuously by monitoring the effectiveness of its predictions and retraining when needed. Monitoring and using the resulting feedback are at the core of machine learning, and no different from how humans approach a new task: we learn from our mistakes, adjust, and act.
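A minimal sketch of such a feedback loop, using synthetic transaction data, a hypothetical cutoff-based fraud model, and an illustrative 95% accuracy threshold (all names and numbers here are assumptions, not a real fraud system):

```python
import random

random.seed(0)

def accuracy(model, labeled):
    """Fraction of (amount, is_fraud) pairs the model predicts correctly."""
    return sum(model(x) == y for x, y in labeled) / len(labeled)

def train(labeled):
    """Hypothetical trainer: learn a cutoff separating fraud from normal."""
    frauds = [x for x, y in labeled if y]
    normals = [x for x, y in labeled if not y]
    cutoff = (min(frauds) + max(normals)) / 2
    return lambda x: x > cutoff

# Initial model trained on historical transactions.
history = [(random.uniform(5, 100), False) for _ in range(50)] + \
          [(random.uniform(500, 900), True) for _ in range(10)]
model = train(history)

# Feedback loop: once actuals arrive, compare them with predictions and
# retrain whenever accuracy drops below the chosen threshold.
for month in range(3):
    # Criminals adapt: fraudulent amounts drift downward over time.
    actuals = [(random.uniform(5, 100), False) for _ in range(50)] + \
              [(random.uniform(150, 300 - 60 * month), True) for _ in range(10)]
    acc = accuracy(model, actuals)
    if acc < 0.95:
        model = train(actuals)  # refresh with up-to-date data
        print(f"month {month}: accuracy {acc:.2f} -> retrained")
    else:
        print(f"month {month}: accuracy {acc:.2f} ok")
```

The loop itself is the point: prediction, comparison with actuals, and conditional retraining run continuously, rather than as a one-time project.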


Companies that are convinced that machine learning should be a core component of their analytics journey need a tested and repeatable model: a methodology. Our experience working with countless clients has led us to devise a methodology that we call DataFirst. It is a step-by-step approach for machine learning success.

Phase 1: The Data Assessment
The objective is to understand your data assets and verify that all the data needed to meet the business goal for machine learning is available. If not, you can take action at that point, to bring in new sources of data (internal or external), to align with the stated goal.

Phase 2: The Workshop
The purpose of the workshop is to ensure alignment on the definition and scope of the machine learning project. We usually cover these topics:
- Level-set on what machine learning can and cannot do
- Agree on which data to use
- Agree on the metric to be used for results evaluation
- Explore how the machine learning workflow, especially deployment and the feedback loop, would integrate with other IT systems and applications

Phase 3: The Prototype
The prototype aims to show machine learning value with actual data. It will also be used to assess the performance and resources needed to run and operate a production-ready machine learning system. When completed, the prototype is often key to securing a decision to develop a production-ready system.


Leaders in the Data era will leverage their assets to develop superior machine learning and insight, driven from a dynamic corpus of data. A differentiated approach requires a methodical process and a feedback loop. In the modern business environment, data is no longer an aspect of competitive advantage; it is the basis of competitive advantage.

iPad Pro: Going All-in

Here is my tweet from a few weeks back:

I have given it a go, going all-in on the iPad Pro. In short, I believe I have discovered the future of personal computing. That said, to do this you truly have to change the way you work: how you spend your time, how you communicate, etc. But it's worth it, and it will probably make you a better professional. I knew I was hooked when I had to go back to my MacBook for something and started touching the screen; the touch interface had become ingrained in my work.

Here are my quick observations:

1) The speed of the iPad Pro is unbelievable. While I didn't realize this in advance, this fact alone makes up for a lot of the reasons why I could never move to an iPad before.

2) You have to master multitasking on the iPad Pro in order to make the switch. There are a lot of on-screen shortcuts, keyboard shortcuts, and hand gestures. If you are not using them, you will not understand the advantage of this form factor.

3) Keyboard shortcuts are now available for my corporate mail. That's a big time saver.

4) I never have to worry about a power cable. The battery on this is great, but even if it gets low, nearly everyone I know has a compatible charger.

5) The integration of apps on the Pro is tremendous: Box/Office, Slack, etc.

6) It goes without saying that the Pro is super light and convenient for travel.

7) Here are some things I can't do on the iPad Pro:
- Renew Global Entry
- Corporate workflow (forms and expenses)
- Blogging (writing is easy, but posting to a corporate blog or even Blogger is very hard). I'm not sure why there is no good app for this.

8) I got the smaller version of the iPad Pro. I thought the large one was just too big. It seems like the ideal size may be a size in between the two.

In short, after a few weeks, I highly recommend it. You can make the switch, but you'll likely need a laptop once a week or so for some of the items mentioned above. I haven't really gotten into the Apple Pencil yet; I've used it a couple of times and may try it more over the next couple of weeks.

Data Science is a Team Sport

In 2013, Ron Howard directed and released the movie Rush, a film that captured the rivalry between James Hunt and Niki Lauda during the 1976 Formula One racing season. It’s a vivid portrait of the drivers and their personalities—a pretty typical, if captivating, focus on the drivers as heroes of the race. But the film does something deeper and more interesting as well: it looks into the essence of Formula One—a true team sport.

“Formula” in Formula One refers to the set of rules to which all participants' cars must conform. Formula One rules were agreed upon in 1946, on the heels of World War II. Modern Formula One cars are open cockpit, single-seat vehicles. The cornering speed of a car comes from “wings” mounted at the front and rear of the vehicle. The tires also play a major role in the cornering speed of a car. Carbon disc brakes are used to increase performance. Engines have evolved to turbocharged V6’s. All these components are integrated to provide precision and performance, and to win the race. However, the precision and design of the vehicle is useless, without the right team.

In Formula One, an “entrant” is the person who registers a car and driver for the race, and maintains the vehicle. The “constructor” is the person who builds the engine or chassis and owns the intellectual rights to the design. The “pit crew” is the team that prepares and maintains the vehicle before, during, and after the race. The cameras focus on the driver, with a couple of obligatory shots of the pit crew scrambling to change tires. But the real story is the collaboration of the complete team: experts working together to make the difference between success and failure.


Since the turn of the century, enterprises around the world have been on a journey to master data science and analytics. We have fewer camera crews, and no cool uniforms, but the goal is no less difficult to achieve. Said simply, we want the right information, at the right moment, to make better decisions. Despite years of effort, organizations have achieved inconsistent results. Some are building competitive moats with machine learning on a large corpus of data, but others are only reducing their costs by 3%, using some new tools. This is best viewed on an enterprise maturity curve:

Why are some organizations able to achieve differentiated results, while others struggle to set up a Hadoop cluster?


Spark is the Analytics Operating System for the modern enterprise. Anyone using data, starting right now, will be leveraging Spark. Spark enables universal access to data in an organization.

Today, we are announcing the Data Science Experience, the first enterprise app available for the Analytics Operating System. This is the first integrated development environment for real-time, high performance Analytics, designed to blend emerging data technologies and machine learning into existing architectures.

An IDE for data science is a collaborative environment; it brings data scientists together to make data science and machine learning available to everyone. Today, data science is an individual sport. If you are a data scientist at a retailer, for example, you have to choose your own tool or flavor, work on your own, and, with any luck, produce a meaningful insight. Anything you learn stays with you—it's self-contained, because it is built in your own lingua franca.

Now, with the Data Science Experience, you can use any language you want—R, Python, Scala, etc.—and share your models with other data scientists in your organization.

We have made data science a team sport.

In Formula One parlance, Spark is the chassis, holding everything together. The Data Science Experience (the IDE) is the integrated components, acting as one, to drive precision and performance. And the data science discipline now has a driver, a pit crew, a constructor, and a coach: an incredible vehicle whose sum is greater than its parts. A team.

The Data Science Experience is born on the cloud. It adapts to open source innovation. And it grows stronger as more and more data scientists around the globe create solutions based on Spark. Further, the ecosystem for the Data Science Experience is open and available. We are proud to have partners like H2O, RStudio, Lightbend, and Galvanize, to name a few.

With Data Science Experience, the discipline of data science can now accomplish exponentially greater outcomes. It’s the difference between a shiny car sitting in a garage, and crossing the finish line at 230 miles per hour.


IBM is building the next generation analytics platform in the cloud.

1. It started with our investment in Apache Spark as the Analytics O/S, last year.
2. It continues today, as we launch the first IDE for this new way of thinking about data & analytics.
3. Over time, this will evolve as the platform for an enterprise in the data era.

All of this is enabled by Spark.


In June 2015, we announced IBM’s commitment to Apache Spark. In closing, I want to provide some context on our progress in the last year. If you missed it last year, here is why I believe Spark will be a critical force in technology, on the same scale as Linux.

So, what have we accomplished and where are we going?

1) We continue to expand the Spark Technology Center (STC). We opened an STC in India. We continue to hire aggressively. And, later this year, we will move into our new home on Howard St. in San Francisco.

2) Client traction and response has been phenomenal. We have 40+ client references already and more on the way.

3) We have open sourced SystemML as promised and are working on it with the community, in the open. This contribution is over 100,000 lines of code. SystemML was accepted into Apache as an official Incubator project as of November 2015. Since it was open-sourced, 859 contributions have been made to the project (e.g. a build-out of the Spark backend, API improvements, usability with Scala Spark & PySpark notebooks for data science, and experimental work on deep learning).

4) For Spark 1.6.x, a total of 29 team members contributed to the release (26 of them from the STC), and each contributing engineer is credited in the release notes of Spark 1.6.x. For Spark 2.0, 31 STC developers have contributed thus far; work is still in progress.

5) Our Spark-specific JIRAs amount to almost 25,000 lines of code. You can watch them in action here. Much of our focus has been on SQL, MLlib, and PySpark.

6) We launched the Open Source Analytics Ecosystem and are working closely with partners like Databricks, Lightbend, RStudio, H2O, and many others. We welcome all.

7) We have trained ~400,000 data scientists through a number of forums.

8) Adoption of the Spark Service on IBM Cloud continues to grow exponentially, as users seek access to the Analytics Operating System.

9) We have over 30 IBM products that are leveraging Spark and many more in the pipeline.

10) We launched a Spark newsletter. Anyone can subscribe here.

11) Lastly, we have launched a Spark Advisory Council. Over 25 leading enterprises and partners — Spark experts building new companies and established industry leaders building new platforms — participate in this regular dialogue about their experiences with Spark and the direction of the Spark project. We use this thinking to focus our efforts in the Spark Technology Center. All are welcome. Contact us here if you are interested.


Data Science is a team sport. Spark is the enabler. This is why I stated last year that anyone using data will be leveraging Spark in the future. That future is quickly arriving.

Winning in Formula One is about speed, performance, precision, and collaboration. Those who find the winner's circle have found a way to integrate the components (human and material) to act as ONE. The same opportunity exists in Analytics and Data Science. Let’s make data science a team sport. Welcome to the first enterprise app for the Analytics Operating System: The Data Science Experience.

The Fortress Cloud

In 1066, William of Normandy assembled an army of over 7,000 men and a fleet of over 700 ships to defeat England's King Harold Godwinson and secure the English throne. King William, recognizing his susceptibility to attack, immediately constructed a network of castles to preserve his kingdom and improve his status among followers. 

The word 'castle' comes from the Latin word castellum, which means 'fortress.' While Medieval castles evolved in structure and function through the years, their core roles have not changed:

1. To protect, as a defensive measure.
2. To provide a platform from which to wage battle, as an offensive measure.
3. To ensure orderly governance.

Medieval castles were well planned in terms of their location and several key attributes. They were built near or on a water spring. They had direct access to key transportation routes and were built on high ground to make defending the stronghold a bit easier. 

I have written extensively about The Big Data Revolution, researching how digital technologies and data exploitation are impacting industries in the Data era. While every industry is different, there are clear patterns in how data is reinventing business processes and disrupting traditional business models. Most notable is that the Revolution cannot be effectively waged without the right protection, a foundation for an offensive, and orderly governance. We need a modern-day castle for The Big Data Revolution: a fortress cloud.


The first wave of big data has hit, creating great opportunities, but also opening cracks in company security, raising worries about customer data privacy, and exposing the limitations of current analytics. Perhaps the Big Data Maturity curve captures it best:

Most of the investments to date have been focused on cost reduction and extending existing IT capabilities. We are now entering an era that will be marked by business re-invention on the basis of data. Incumbents beware. This demands a thoughtful approach to the security measures companies must take, to how improved analytics can deliver stronger insights, and to the new privacy contract consumers are demanding.

The traditional IT stack is giving way to a fluid data layer: a new set of composable cloud services, defined by next generation capabilities. With this new approach to analytics, we must re-imagine all aspects of data movement and governance for that world. I see 3 defining capabilities:

1. Ingest: the ability to lift data from wherever it resides and integrate it into a cloud-based fluid data layer. This must be done seamlessly and at incredibly high speeds, with little to no manual intervention.
2. Preparation: the ability to massage, filter, and select only the data most relevant to the task at hand.
3. Governance: the ability to catalog, describe (metadata), and manage access to sensitive data sets.
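These three capabilities can be sketched as a minimal pipeline. Everything below (the source functions, field names, masking rule) is a hypothetical illustration of the ingest/prepare/govern pattern, not any particular product's API.

```python
# A minimal sketch of the ingest -> preparation -> governance flow.
# All names (crm, web, sensitive_fields, etc.) are hypothetical.

def ingest(sources):
    """Lift data from wherever it resides into one fluid layer."""
    records = []
    for source in sources:
        records.extend(source())  # no per-source manual intervention
    return records

def prepare(records, relevant):
    """Massage and filter: keep only the fields relevant to the task."""
    return [{k: r[k] for k in relevant if k in r} for r in records]

def govern(records, sensitive_fields):
    """Catalog metadata and mask access to sensitive fields."""
    catalog = {
        "record_count": len(records),
        "fields": sorted({k for r in records for k in r}),
    }
    masked = [
        {k: ("***" if k in sensitive_fields else v) for k, v in r.items()}
        for r in records
    ]
    return catalog, masked

# Two illustrative sources: a CRM extract and a web log.
crm = lambda: [{"name": "Ann", "ssn": "123-45-6789", "spend": 420}]
web = lambda: [{"name": "Bob", "clicks": 17, "spend": 55}]

raw = ingest([crm, web])
prepped = prepare(raw, relevant=["name", "ssn", "spend"])
catalog, safe = govern(prepped, sensitive_fields={"ssn"})
```

The point of the sketch is the ordering: data is lifted as-is, narrowed to what the task needs, and only then cataloged and access-controlled before anyone consumes it.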

Companies will require a new approach to data integration, data preparation, data governance, and data pipelining; a modern day fortress, on the cloud, ready for The Big Data Revolution.


The Basel Committee on Banking Supervision (BCBS) announced regulation 239 in January 2013. For many institutions, this immediately put them on the defensive. However, Sun Tzu reminds us, "Security against defeat implies defensive tactics; ability to defeat the enemy means taking the offensive." For data-era organizations, BCBS 239 represents an opportunity to go on the offensive.

For those less familiar, the principles of BCBS 239 center on governance, data and IT architecture, accuracy, timeliness, and completeness in reporting, when it comes to an organization's data assets and processes. While these may appear to be defensive measures, the endgame is a platform from which to wage battle: a true governance offensive.

With the right data architecture, established on the cloud, a new set of opportunities emerge for an enterprise that embraces governance as an offensive measure. An enterprise will find itself with a castle for the Data era, armed with key offensive weapons:

a) Self-Service: designed to empower the citizen analyst, data engineer, and data steward to engage on their own accord. A user does not need to ask for access to data; they simply engage and discover.

b) Hybrid: taps into data everywhere...ground to cloud. Where the data resides does not matter to the consumer/user; it’s just data.

c) Intelligent: embedded analytics make everyone superhuman and automate many manual processes.

d) All Data: works with both structured and unstructured data.

These are the principles that have guided the construction of our fortress destination on the cloud. This is IBM DataWorks.


When William of Normandy built his castles many years ago, he adorned them with a number of features: towers, curtain walls, moats, drawbridges, portcullises, and more. All were a set of best practices designed for defensive protection, coupled with a base from which to wage an offensive. It was modern protection, for an unmodern time.

Our fortress cloud, like that of William of Normandy, is designed around governance as a strategic lever: offensive and defensive. It’s a unique destination, for our modern time.


Special thanks to @danhernandezATX for editing and guidance.

Machine Learning and The Big Data Revolution

I had the opportunity to speak at TDWI in Chicago today. It was a tremendous venue and a well organized event. Thanks to the TDWI team. I spoke on the topic of machine learning and the big data revolution. The slides are below, although they are not all self-explanatory.

3 key points from the talk:

Scale Effects
In the 20th century, scale effects in business were largely driven by breadth and distribution. A company with manufacturing operations around the world had an inherent cost and distribution advantage, leading to more competitive products. A retailer with a global base of stores had a distribution advantage that could not be matched by a smaller company. These scale effects drove competitive advantage for decades. The Internet changed all of that.

In the modern era, there are three predominant scale effects:

-Network: lock-in that is driven by a loyal network (Facebook, Twitter, Etsy, etc.)
-Economies of Scale: lower unit cost, driven by volume (Apple, TSMC, etc.)
-Data: superior machine learning and insight, driven by a dynamic corpus of data

The Big Data Maturity curve

This is the barometer for any enterprise seeking competitive advantage based on data. Many companies are beginning to utilize new techniques to reduce the cost of data infrastructure. But the competitive breakthrough comes when an enterprise moves to the right side of the curve: line-of-business analytics that transform operations, and new business imperatives and business models. I alluded to a number of companies that I admire for leading on this side of the curve: CoStar, StitchFix, and Monsanto.


AnalyticsFirst is a proven and repeatable methodology for applying the value of data science and machine learning in the context of an enterprise. With thousands of successful engagements, we have learned a lot about what works (and what does not). I've seen companies achieve major breakthroughs leveraging this methodology, often ending months or years of frustration. Any organization can lead the revolution with AnalyticsFirst. Let me know if you are interested!

12 Attributes of a Great Leader

"A manager's output is the output of the organization under her supervision or influence." - Andy Grove

I believe that most managers want to be great managers. In fact, many aspire to transcend management and to be deemed leaders. While there are countless books on the topic, many offer too much theory and too little practice to be relevant and applicable. One of the main roles of a leader is to teach: through actions of commission, actions of omission, and through thoughtful dialogue. The goal of this series is to share what I believe are the hallmarks of great management.

In High Output Management, Andy Grove explores why, at times, an individual is not able to achieve their potential in a job. He simplifies it to one of 2 reasons: 1) they are incapable, or 2) they are not motivated. In either case, it's the responsibility of the manager to assess and remediate the situation. This is neither comfortable nor easy. This is why great leadership is difficult.

I will focus on what I think are the 12 defining attributes of a great leader:

1) Team builder: assembling and motivating teams.
2) Running teams: a disciplined management system, based on thoughtful planning.
3) Expectations, accountability, and empowerment: the #1 issue I see is here.
4) Being on offense, not defense: leading instead of reacting.
5) Engagement and influence: creating informal influence broadly.
6) Operational rigor: managing the details, without micro-managing.
7) Clear and candid communication: never leaving a gray area.
8) Training: a critical role of a manager.
9) Mental toughness: never talked about enough, yet many managers fail due to this aspect alone.
10) Strategic thinking: having a point of view, differentiated and right.
11) Obsessing over clients: knowing who pays the bills and applying it to every decision.
12) Positive attitude: motivating by example.

I'll cover each topic via blog and/or podcast.


2) Running teams-

The hardest thing for a new manager/leader to adjust to is being the pace setter. Once you assume the role of a leader, your job is to be on offense, not defense. I see even the greatest individual contributors struggle with this at times, because their success has been defined by doing everything that is asked of them. However, once you assume the manager role, you must become the one setting the direction and sparking activity. And it can't just be activity for activity's sake; it has to be thoughtful, pointed, and focused. This is the notion of a thoughtful management system.

In this podcast, I am joined by a great leader, Derek Schoettle, who was the CEO of Cloudant, before joining IBM via acquisition. We discuss how effective managers run teams, set pace, and foster open communication. 3 major topics are covered:

1) Committing to a course: no sudden, jerky movements; establishing consistency in communication patterns.

2) The Rockefeller Habits: Set priorities (1-5), manage key metrics/data, and establish a rhythm.

3) Conducting 1-on-1's: Using formal and informal approaches to communicate for impact.

I hope you enjoy the podcast.


5) Engagement and influence-

"Great leaders are relaxed when the team is stressed, and stressed when the team is relaxed."

I had a chance to talk with Jerome Selva about Engagement and Influence recently.

Podcast here.

We discuss:

- Informal Influence
- Getting comfortable in your own skin
- Tools for informal influence (blogs, videos, etc.)
- Looking outside your defined scope
- Emotional intelligence

In addition, Jerome shared the following for further reading:

Travis Bradberry and Jean Greaves — "Emotional Intelligence 2.0"

Emily Sterrett — “The Manager’s Pocket Guide to Emotional Intelligence”

Daniel Goleman — "What Makes a Leader?"

HBR article:

EI test:


7) Clear and candid communication -
I tend to say whatever is on my mind, as succinctly as possible. I believe it provides clarity (even if it’s not agreed with), and clarity leads to speed. Hence, I’ve always leaned towards saying exactly what I am thinking. I’ve had more than one person tell me, “you say the things that other people are thinking.”

Now, that’s my style. It doesn’t mean it’s the right style or the only style. Everyone is different and should communicate in a manner that fits their style. That being said, I think one hallmark of leadership and management is being able to have the candid conversations and if necessary, delivering the uncomfortable truth.

In this podcast, I am joined by a colleague, Ritika Gunnar, to discuss the topic of Candid Communication as a manager and leader. Our conversation focuses on 3 areas:

1) Sharpening contradictions: the best managers identify disagreement in their team and tease it out. They know that letting it persist can create an unhealthy culture. It’s much better to get it on the table, even if it leads to a difficult discussion, than to let it lie in the background.

2) Don’t let problems linger: if you have a challenge with someone or something, speak up…put it on the table. If you let it linger silently, frustration and anxiety build and the trust amongst a team deteriorates over time.

3) Giving feedback: for many managers, it is very hard to give candid feedback, especially when it is negative or potentially confrontational. I believe that at their core, everyone wants to know the truth and where they stand. So, we discuss some techniques for how to deliver the harder messages. Your teams will thank you for it (sometimes many years down the road).

I hope you enjoy the podcast.

"In classical times, when Cicero had finished speaking, the people said, 'How well he spoke', but when Demosthenes had finished speaking, they said, 'Let us march'"- Adlai Stevenson

Pattern Recognition

Elements of Success Rhyme

The science of pattern recognition has been explored for hundreds of years, with the primary goal of optimally extracting patterns from data or situations, and effectively separating one pattern from another. Applications of pattern recognition are found everywhere, whether it’s categorizing disease, predicting outbreaks of disease, identifying individuals (through face or speech recognition), or classifying data. In fact, pattern recognition is so ingrained in many things we do, we often forget that it’s a unique discipline which must be treated as such if we want to really benefit from it.

According to Tren Griffin, a prominent blogger and IT executive, Bruce Dunlevie, a general partner at the venture capital firm Benchmark Capital, once said to him, “Pattern recognition is an essential skill in venture capital.” Griffin elaborates the point Dunlevie was making that “while the elements of success in the venture business do not repeat themselves precisely, they often rhyme. In evaluating companies, the successful VC will often see something that reminds them of patterns they have seen before.” Practical application of pattern recognition for business value is difficult. The great investors have a keen understanding of how to identify and apply patterns.

Pattern Recognition: A Gift or a Trap?

Written in 2003 by William Gibson, Pattern Recognition (G.P. Putnam’s Sons) is a novel that explores the human desire to synthesize patterns in what is otherwise meaningless data and information. The book chronicles a global traveler, a marketing consultant, who has to unravel an Internet-based mystery. In the course of the book, Gibson implies that humans find patterns in many places, but that does not mean that they are always relevant. In one part of the book, a friend of the marketing consultant states, “Homo sapiens are about pattern recognition. Both a gift and a trap.” The implication is that humans find some level of comfort in discovering patterns in data or in most any medium, as it helps to explain what would otherwise seem to be a random occurrence. The trap comes into play when there is really not a pattern to be discovered because, in that case, humans will be inclined to discover one anyway, just for the psychological comfort that it affords.

Patterns are useful and meaningful only when they are valid. The bias that humans have to find patterns, even if patterns don’t exist, is an important phenomenon to recognize, as that knowledge can help to tame these natural biases.

Tsukiji Market

The seafood will start arriving at Tsukiji before four in the morning, so an interested observer must start her day quite early. The market will see 400 different species passing through on any given day, eventually making their way to street carts or the most prominent restaurants in Tokyo. The auction determines the destination of each delicacy. In any given year, the fish markets in Tokyo will handle over 700 metric tons of seafood, representing a value of nearly $6 billion.

The volume of species passing through Tsukiji represents an interesting challenge in organizing and classifying the catch of the day. In the 2001 book Pattern Classification (Wiley), Richard Duda provided an interesting view of this process, using fish as an example.

With a fairly rudimentary example — fish sorting — Duda is able to explain a number of key aspects of pattern recognition.

A worker in a fish market, Tsukiji or otherwise, faces the problem of sorting fish on a conveyor belt according to their species. This must happen over and over again, and must be done accurately to ensure quality. In Duda’s simple example in the book, it’s assumed that there are only two types of fish: sea bass and salmon.

As the fish come in on the conveyor belt, the worker must quickly determine and classify each fish's species.

There are many factors that can distinguish one type of fish from another: the length, width, weight, number and shape of fins, size of head or eyes, and perhaps the overall body shape.

There are also a number of factors that could interrupt or negatively affect the process of distinguishing (sensing) one type from the other. These factors may include the lighting, the position of the fish on the conveyor belt, the steadiness of the photographer taking the picture, and so on.

The process, to ensure the most accurate determination, consists of capturing the image, isolating the fish, taking measurements, and making a decision. However, the process can be enhanced or complicated, based on the number of variables. If an expert fisherman indicates that a sea bass is longer than salmon, that’s an important data point, and length becomes a key feature to consider. However, a few data points will quickly demonstrate that while sea bass are longer than salmon on average, there are many examples where that does not hold true. Therefore, we cannot make an accurate determination of fish type based on that factor alone.

With the knowledge that length cannot be the sole feature considered, selecting additional features becomes critical. Multiple features — for example, width and lightness — start to give a higher- confidence view of the fish type.

Duda defines pattern recognition as the act of collecting raw data and taking an action based on the category of the pattern. Recognition is not an exact match. Instead, it’s an understanding of what is common, which can be expanded to conclude the factors that are repeatable.
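The idea that no single feature suffices, but a combination gives a higher-confidence decision, can be sketched as a tiny two-feature classifier. The thresholds and the lightness rule below are invented for illustration; a real system would learn them from labeled data, as Duda describes.

```python
# Toy classifier in the spirit of Duda's sea bass vs. salmon example.
# The thresholds and feature rules are hypothetical illustrations.

def classify(length_cm, lightness):
    # Length alone is unreliable: sea bass are longer only on average,
    # so we vote across multiple features instead of trusting one.
    score = 0
    score += 1 if length_cm > 60 else -1   # longer fish: lean sea bass
    score += 1 if lightness < 0.5 else -1  # darker skin: lean sea bass (hypothetical)
    return "sea bass" if score > 0 else "salmon"
```

With two features voting, one misleading measurement (an unusually long salmon, say) no longer forces a wrong answer on its own: `classify(70, 0.3)` returns `"sea bass"`, while `classify(40, 0.8)` returns `"salmon"`.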

A Method for Recognizing Patterns

Answering the three key questions (what is it?, where is it?, and how is it constructed?) seems straightforward — until there is a large, complex set of data to be put through that test. At that point, answering those questions is much more daunting. Like any difficult problem, this calls for a process or method to break it into smaller steps. In this case, the method can be as straightforward as five steps, leading to conclusions from raw inputs:

1. Data acquisition and sensing: The measurement and collection of physical variables.

2. Pre-processing: Removing noise from the data and starting to isolate patterns of interest. In the fish example given earlier in the chapter, you would isolate the fish from each other and from the background, so that patterns are well separated and not overlapping.

3. Feature extraction: Finding a new representation in terms of features. For the fish, you would measure certain features.

4. Classification: Utilizing features and learned models to assign a pattern to a category. For the fish, you would clearly identify the key distinguishing features (length, weight, etc.).

5. Post-processing: Assessing the confidence of decisions, by leveraging other sources of information or context. Ultimately, this step allows the application of content-dependent information, which improves outcomes.
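The five steps can be sketched as a pipeline of functions. Every stage body below is a placeholder standing in for real sensing, denoising, and learned models; only the shape of the pipeline is the point.

```python
# Skeleton of the five-step pattern recognition method. Each stage is a
# placeholder; a real system plugs in sensors, filters, and trained models.

def acquire():
    """1. Data acquisition and sensing: measure physical variables."""
    return [(62.0, 0.3), (41.0, 0.9), (70.0, 0.2)]  # (length, lightness), illustrative

def preprocess(samples):
    """2. Pre-processing: drop noisy or implausible measurements."""
    return [s for s in samples if 10 < s[0] < 200]

def extract_features(samples):
    """3. Feature extraction: re-represent raw data as named features."""
    return [{"length": l, "lightness": x} for l, x in samples]

def classify(feature):
    """4. Classification: assign a category via a (toy) learned rule."""
    return "sea bass" if feature["length"] > 60 else "salmon"

def postprocess(labels):
    """5. Post-processing: assess decisions using context (here, counts)."""
    return {label: labels.count(label) for label in set(labels)}

labels = [classify(f) for f in extract_features(preprocess(acquire()))]
summary = postprocess(labels)
```

Each stage consumes the previous stage's output, which is what makes the method repeatable: swapping in a better sensor or model changes one function, not the pipeline.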

Pattern recognition techniques find application in many areas, from machine learning to statistics, from mathematics to computer science. The real challenge is practical application. And to apply these techniques, a framework is needed.

Elements of Success Rhyme (continued)

Pattern recognition can be a gift or a trap.

It’s a trap if a person is lulled into believing that history repeats itself and therefore there is simply a recipe to be followed. This is lazy thinking, which rarely leads to exceptional outcomes or insights.

On the other hand, it’s a gift to realize that, as mentioned in this chapter’s introduction, the elements of success rhyme. Said another way, there are commonalities between successful strategies in businesses or other settings. And the proper application of a framework or methodology to identify patterns and to understand what is a pattern and what is not can be very powerful.

Humans are inherently biased to seek patterns, even where patterns do not exist. Distinguishing a genuine pattern from the product of that bias is a differentiator in the Data era. Indeed, big data provides a means of identifying statistically significant patterns in order to avoid these biases.

This post is adapted from the book, Big Data Revolution: What farmers, doctors, and insurance agents teach us about discovering big data patterns, Wiley, 2015. Find more on the web at

Ubuntu: A New Way to Work

“Teamwork and intelligence wins championships.” — Michael Jordan

An anthropologist was dispatched to Africa many years ago to study the lives and customs of local tribes. While each tribe is unique, many customs are shared across geographies and locations. The anthropologist tells a story of how he once brought along a large basket of candy, which quickly got the attention of all the children in the tribe. Instead of just handing it out, he decided to play a game. He set the basket of candy under a tree and gathered all of the children about 50 yards away from it. He informed them that they would have a race, and that the first child to get there could keep all of the candy. The children lined up, ready for the race. When the anthropologist said “Go”, he was surprised by what happened: all of the children joined hands and moved towards the tree in unison. When they got there, they neatly divided up the candy and sat down to enjoy it together. When he asked why they did this, the children responded, “Ubuntu. How could any of us be happy if all the others were sad?”

Nelson Mandela describes it well: “In Africa, there is a concept known as Ubuntu: the profound sense that we are human only through the humanity of others; that if we are to accomplish anything in this world it will in equal measure be due to the work and achievements of others.”


Read the rest on Medium.

Decentralized Analytics for a Complex World

In 2015, General Stan McChrystal published Team of Teams: New Rules of Engagement for a Complex World. It was the culmination of his experience in adapting to a world that had changed faster than the organization he was responsible for leading. When he assumed command of the Joint Special Operations Task Force in 2003, he recognized that their typical approaches to communication were failing. The enemy was a decentralized network that could move very quickly; accordingly, none of his organization's traditional advantages (equipment, training, etc.) mattered.

He saw the need to re-organize his force as a network, combining transparent communication with decentralized decision-making authority. Said another way, decisions should be made at the lowest level possible, as quickly as possible, and then, and only then, should data flow back to a centralized point. Information silos were torn down and data flowed faster, as the organization became flatter and more flexible.

Observing that the world is changing faster than ever, McChrystal recognized that the endpoints were the most valuable and the place that most decision making should take place. This prompted the question:

What if you could combine the adaptability, agility, and cohesion of a small team with the power and resources of a giant organization?


As I work with organizations around the world, I see a similar problem to the one observed by General McChrystal: data and information are locked into an antiquated and centralized model. The impact is that the professionals in most organizations do not have the data they need, in the moment it is required, to make the optimal decision. Even worse, most investments around Big Data today are not addressing this problem, as they are primarily focused on reducing the cost of storage or simply augmenting traditional approaches to data management. Enterprises are not moving along the Big Data maturity curve fast enough:

While it's not life or death in most cases, the information crisis in organizations is reaching a peak. Companies have not had a decentralized approach to analytics to complement their centralized architecture. Until now.


Today, we are announcing Quarks: an open source, lightweight, embedded streaming analytics runtime, designed for edge analytics. It can be embedded on a device, a gateway, or really anywhere, to analyze events locally, on the edge. For the first time ever, analytics will be truly decentralized. This will shorten the window to insight, while reducing communication costs by sending only the relevant events back to a centralized location. What General McChrystal did to modernize complex field engagements, we are doing for analytics in the enterprise.

While many believe that the Internet of Things (IoT) may be over-hyped, I would assert the opposite; we are just starting to realize the enormous potential of a fully connected world. A few data points:

1) $1.7 trillion of value will be added to the global economy by IoT in 2019. (source: Business Insider)
2) The world will grow from 13 billion to 29 billion connected devices by 2020. (source: IDC)
3) 82% of enterprise decision makers say that IoT is strategic to their enterprise. (source: IDC)
4) While exabytes of IoT data are generated every day, 88% of it goes unused. (Source: IBM Research)

Despite this obvious opportunity, most enterprises are limited by the costs and time lag associated with transmitting data for centralized analysis. To compound the situation, data streams from IoT devices are complex, and there is little ability to reuse analytical programs. Lastly, 52% of developers working on IoT are concerned that existing tools do not meet their needs (source: Evans Data Corporation). Enter, the value of open source.


Quarks is a programming model and runtime for analytics at the edge. It includes a programming SDK, a lightweight and embeddable runtime, and is open source (incubation proposal), available on GitHub.

This gives data engineers what they want:

− Easy access to IoT data streams
− Integrated data at rest with IoT data streams
− Curated IoT data streams
− The ability to make IoT data streams available for key stakeholders

This gives data developers what they want:
− Access to IoT data streams through APIs
− The ability to deploy machine learning, spatial temporal and other deep analytics on IoT data streams
− Familiar programming tools like Java or Python to work with IoT data streams
− The ability to analyze IoT data streams to build cognitive applications
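The pattern behind these capabilities (analyze events locally on the device, forward only the relevant ones to the center) can be illustrated in a few lines. This is a plain-Python sketch of the edge-analytics idea, not the Quarks API; the sensor readings and threshold are invented.

```python
# Illustration of the edge-analytics pattern behind Quarks: analyze
# events locally and forward only the relevant ones. This sketch is not
# the Quarks API; all readings and thresholds are hypothetical.

forwarded = []

def send_to_datacenter(event):
    """Stand-in for shipping an event to Kafka / Hadoop / Spark."""
    forwarded.append(event)

def edge_filter(readings, threshold):
    """Runs on the device: forward only readings that cross a threshold."""
    for reading in readings:
        if abs(reading["accel_g"]) > threshold:  # e.g. a rider crash
            send_to_datacenter(reading)

# Simulated on-bike accelerometer stream.
sensor_stream = [
    {"rider": 12, "accel_g": 0.4},
    {"rider": 12, "accel_g": 9.1},  # sudden spike: crash candidate
    {"rider": 33, "accel_g": 0.7},
]
edge_filter(sensor_stream, threshold=5.0)
```

Of three readings, only the spike crosses the threshold and is transmitted; the routine telemetry never leaves the device, which is where the communication-cost savings come from.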

Analytics at the edge is finally available to everyone, starting today, with Quarks. And the use cases are extensive. For example, in 2015, Dimension Data became the official technology partner for the Tour de France, the world's largest and most prestigious cycling race.

In support of their goal to revolutionize the viewing experience of billions of cycling fans across the globe, Dimension Data leveraged IBM Streams to analyze thousands of data points per second, from over 200 riders, across 21 days of cycling.

Embedding Quarks in connected devices on the network edge (essentially on each bike) would enable a new style of decentralized analytics: detecting critical race events in real time as they happen (a major rider crash, for example), rather than having to infer them from location and speed data alone. With the ability to analyze data at the endpoint, that data stream can then be integrated with Kafka, etc., and moved directly into Hadoop for storage or Spark for analytics. This will drive analytics at a velocity never before seen in enterprises.


We live in a world of increasing complexity and speed. As General McChrystal described, organizations that rely solely on centralized architectures for decision making and information flow will fail. At IBM, we are proud to lead the decentralization of analytics, complementing centralized architectures, as a basis for Cognitive Computing.