Monday, May 16, 2016

The Fortress Cloud

In 1066, William of Normandy assembled an army of over 7,000 men and a fleet of over 700 ships to defeat England's King Harold Godwinson and secure the English throne. King William, recognizing his susceptibility to attack, immediately constructed a network of castles to preserve his kingdom and improve his status among followers. 

The word 'castle' comes from the Latin word castellum, which means 'fortress.' While Medieval castles evolved in structure and function through the years, their core role has not changed:

1. To protect, as a defensive measure.
2. To provide a platform from which to wage battle, as an offensive measure.
3. To ensure orderly governance.

Medieval castles were well planned in terms of their location and several key attributes. They were built near or on a water spring. They had direct access to key transportation routes and were built on high ground to make defending the stronghold a bit easier. 

I have written extensively about The Big Data Revolution, researching how digital technologies and data exploitation are impacting industries in the Data era. While every industry is different, there are clear patterns in how data is reinventing business processes and disrupting traditional business models. Most notable is that the Revolution cannot be effectively waged without the right protection, a foundation for an offensive, and orderly governance. We need a modern-day castle for The Big Data Revolution: a fortress cloud.


The first wave of big data has hit, creating great opportunities but also exposing cracks in company security, raising worries about customer data privacy, and showing the limitations of current analytics. Perhaps the Big Data Maturity curve captures it best:

Most of the investments to date have been focused on cost reduction and extending existing IT capabilities. We are now entering an era that will be marked by business re-invention on the basis of data. Incumbents beware. This demands a thoughtful approach to the security measures companies may have to take, to how improved analytics can help everyone achieve stronger insights, and to the new privacy contract that consumers are demanding.

The traditional IT stack is giving way to a fluid data layer: a new set of composable cloud services, defined by next generation capabilities. With this new approach to analytics, we must re-imagine all aspects of data movement and governance for that world. I see 3 defining capabilities:

1. Ingest- The ability to lift data from wherever it resides and integrate it into a cloud-based fluid data layer. This must be done seamlessly and at incredibly high speeds, with little to no manual intervention. 
2. Preparation- The ability to massage, filter, and select only the data most relevant to the task at hand. 
3. Governance- The ability to catalog, describe (metadata), and manage access to sensitive data sets.
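The three capabilities above can be sketched as three small functions chained together. This is only an illustration under stated assumptions: the `Dataset` class, the function names, the roles, and the sample rows are all hypothetical and do not describe any particular product's API.

```python
from dataclasses import dataclass, field


@dataclass
class Dataset:
    """Hypothetical stand-in for a data set living in the fluid data layer."""
    name: str
    rows: list
    metadata: dict = field(default_factory=dict)
    allowed_roles: set = field(default_factory=set)


def ingest(name, source_rows):
    """Ingest: lift rows from wherever they reside (here, an in-memory list)."""
    return Dataset(name=name, rows=[dict(row) for row in source_rows])


def prepare(dataset, keep_fields, predicate):
    """Preparation: keep only the rows and fields relevant to the task at hand."""
    rows = [
        {key: row[key] for key in keep_fields}
        for row in dataset.rows
        if predicate(row)
    ]
    return Dataset(dataset.name, rows, dict(dataset.metadata), set(dataset.allowed_roles))


def govern(dataset, description, sensitive_fields, allowed_roles):
    """Governance: catalog and describe the data set, and record who may read it."""
    dataset.metadata = {
        "description": description,
        "sensitive_fields": sorted(sensitive_fields),
    }
    dataset.allowed_roles = set(allowed_roles)
    return dataset


def read(dataset, role):
    """Return the rows only if the caller's role has been granted access."""
    if role not in dataset.allowed_roles:
        raise PermissionError(f"role {role!r} may not read {dataset.name!r}")
    return dataset.rows
```

A caller would chain the steps, for example `govern(prepare(ingest("sales", rows), ["customer", "spend"], lambda r: r["region"] == "EU"), "EU sales", set(), {"data_steward"})`, and then call `read` with a role. Keeping governance as an explicit step on the same object is one simple way to make access rules travel with the data.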

Companies will require a new approach to data integration, data preparation, data governance, and data pipelining: a modern-day fortress, on the cloud, ready for The Big Data Revolution.


The Basel Committee on Banking Supervision (BCBS) announced regulation 239 in January 2013. This immediately put many institutions on the defensive. However, Sun Tzu reminds us, "Security against defeat implies defensive tactics; ability to defeat the enemy means taking the offensive." BCBS 239, for data-era organizations, represents an opportunity for an offensive.

For those less familiar, the principles of BCBS 239 center on governance, data and IT architecture, and accuracy, timeliness, and completeness in reporting, as they apply to an organization's data assets and processes. While these may appear to be defensive measures, the endgame is a platform from which to wage battle: a true governance offensive.

With the right data architecture, established on the cloud, a new set of opportunities emerges for an enterprise that embraces governance as an offensive measure. An enterprise will find itself with a castle for the Data era, armed with key offensive weapons:

a) Self-Service: designed to empower the citizen analyst, data engineer, and data steward to engage on their own accord. A user does not need to ask for access to data; they simply engage and discover.

b) Hybrid: taps into data everywhere...ground to cloud. Where the data resides does not matter to the consumer/user; it’s just data.

c) Intelligent: embedded analytics makes everyone a super human and automates many manual processes.

d) All Data: works with both structured and unstructured data.

These are the principles that have guided the construction of our fortress destination on the cloud. This is IBM DataWorks.


When William of Normandy assembled his fortress many years ago, he adorned his castles with a number of attributes: towers, curtain walls, moats, drawbridges, portcullises, etc. All were a set of best practices designed for defensive protection, coupled with a base from which to wage an offensive. It was modern protection, for an unmodern time.

Our fortress cloud, like that of William of Normandy, is designed around governance as a strategic lever: offensive and defensive. It’s a unique destination, for our modern time.


Special thanks to @danhernandezATX for editing and guidance.

Wednesday, May 11, 2016

Machine Learning and The Big Data Revolution

I had the opportunity to speak at TDWI in Chicago today. It was a tremendous venue and a well organized event. Thanks to the TDWI team. I spoke on the topic of machine learning and the big data revolution. The slides are below, although they are not all self-explanatory.

3 key points from the talk:

Scale Effects
In the 20th century, scale effects in business were largely driven by breadth and distribution. A company with manufacturing operations around the world had an inherent cost and distribution advantage, leading to more competitive products. A retailer with a global base of stores had a distribution advantage that could not be matched by a smaller company. These scale effects drove competitive advantage for decades. The Internet changed all of that.

In the modern era, there are three predominant scale effects:

-Network: lock-in that is driven by a loyal network (Facebook, Twitter, Etsy, etc.)
-Economies of Scale: lower unit cost, driven by volume (Apple, TSMC, etc.)
-Data: superior machine learning and insight, driven from a dynamic corpus of data

The Big Data Maturity curve

This is the barometer for any enterprise seeking competitive advantage based on data. Many companies are beginning to use new techniques to reduce the cost of data infrastructure. But the competitive breakthrough comes when an enterprise moves to the right side of the curve: line-of-business analytics to transform operations, and new business imperatives and business models. I alluded to a number of companies that I admire for leading on this side of the curve: CoStar, StitchFix, and Monsanto.


AnalyticsFirst is a proven and repeatable methodology for applying data science and machine learning in the context of an enterprise. With thousands of successful engagements, we have learned a lot about what works (and what does not). I've seen companies achieve major breakthroughs leveraging this methodology, often ending months or years of frustration. Any organization can lead the revolution with AnalyticsFirst. Let me know if you are interested!

Tuesday, May 10, 2016

12 Attributes of a Great Leader: #2 Running Teams

The hardest thing for a new manager/leader to adjust to is being the pace setter. Once you assume the role of a leader, your job is to be on offense, not defense. I see even the greatest individual contributors struggle with this at times, because their success has been defined by doing everything that is asked of them. However, once you assume the manager role, you must become the one setting the direction and sparking activity. And it can't just be activity for activity's sake; it has to be thoughtful, pointed, and focused. This is the notion of a thoughtful management system.

In this podcast, I am joined by a great leader, Derek Schoettle, who was the CEO of Cloudant, before joining IBM via acquisition. We discuss how effective managers run teams, set pace, and foster open communication. 3 major topics are covered:

1) Committing to a course: No sudden, jerky movements and how to establish consistency in communication patterns.

2) The Rockefeller Habits: Set priorities (1-5), manage key metrics/data, and establish a rhythm.

3) Conducting 1-on-1's: Using formal and informal approaches to communicate for impact.

I hope you enjoy the podcast.

Wednesday, May 4, 2016

12 Attributes of a Great Leader: #7 Clear and Candid Communication

I tend to say whatever is on my mind, as succinctly as possible. I believe it provides clarity (even if it’s not agreed with) and clarity leads to speed. Hence, I’ve always leaned towards saying exactly what I am thinking. I’ve had more than one person tell me, “you say the things that other people are thinking.”

Now, that’s my style. It doesn’t mean it’s the right style or the only style. Everyone is different and should communicate in a manner that fits their style. That being said, I think one hallmark of leadership and management is being able to have the candid conversations and if necessary, delivering the uncomfortable truth.

In this podcast, I am joined by a colleague, Ritika Gunnar, to discuss the topic of Candid Communication as a manager and leader. Our conversation focuses on 3 areas:

1) Sharpening contradictions: the best managers identify disagreement in their team and tease it out. They know that letting it persist can create an unhealthy culture. It’s much better to get it on the table, even if it leads to a difficult discussion, than to let it lie in the background.

2) Don’t let problems linger: if you have a challenge with someone or something, speak up…put it on the table. If you let it linger silently, frustration and anxiety build and the trust amongst a team deteriorates over time.

3) Giving feedback: for many managers, it is very hard to give candid feedback, especially when it is negative or potentially confrontational. I believe that at their core, everyone wants to know the truth and where they stand. So, we discuss some techniques for how to deliver the harder messages. Your teams will thank you for it (sometimes many years down the road).

I hope you enjoy the podcast.

Thursday, April 14, 2016

12 Attributes of a Great Leader

"A manager's output is the output of the organization under her supervision or influence." - Andy Grove

I believe that most managers want to be great managers. In fact, many aspire to transcend management and to be deemed leaders. While there are countless books on the topic, sometimes they are too much theory and not enough practice, to be relevant and applicable. One of the main roles of a leader is to teach; through actions of commission, actions of omission, and through a thoughtful dialogue. The goal of this series is to share what I believe are the hallmarks of great management.

In High Output Management, Andy Grove explores why, at times, an individual is not able to achieve their potential in a job. He simplifies it to one of 2 reasons: 1) they are incapable, or 2) they are not motivated. In either case, it's the responsibility of the manager to assess and remediate the situation. This is neither comfortable nor easy, which is why great leadership is difficult.

I will focus on what I think are the 12 defining attributes of a great leader:

1) Team builder- assembling and motivating teams.
2) Running teams- a disciplined management system, based on thoughtful planning.
3) Expectations, Accountability, and Empowerment - the #1 issue I see is here.
4) Being on offense, not defense- leading instead of reacting.
5) Engagement and influence- creating informal influence broadly.
6) Operational rigor- managing the details, without micro-managing.
7) Clear and candid communication - never leaving a gray area.
8) Training- a critical role of a manager.
9) Mental toughness- never talked about enough, yet many managers fail due to this aspect alone.
10) Strategic thinking- having a point of view, differentiated and right.
11) Obsessing over clients- knowing who pays the bills and applying it to every decision.
12) Positive attitude- Motivating by example.

I'll cover each topic via blog and/or podcast.

"In classical times, when Cicero had finished speaking, the people said, 'How well he spoke', but when Demosthenes had finished speaking, they said, 'Let us march'"- Adlai Stevenson

Monday, March 21, 2016

Pattern Recognition

Elements of Success Rhyme

The science of pattern recognition has been explored for hundreds of years, with the primary goal of optimally extracting patterns from data or situations, and effectively separating one pattern from another. Applications of pattern recognition are found everywhere, whether it’s categorizing disease, predicting outbreaks of disease, identifying individuals (through face or speech recognition), or classifying data. In fact, pattern recognition is so ingrained in many things we do, we often forget that it’s a unique discipline which must be treated as such if we want to really benefit from it.

According to Tren Griffin, a prominent blogger and IT executive, Bruce Dunlevie, a general partner at the venture capital firm Benchmark Capital, once said to him, “Pattern recognition is an essential skill in venture capital.” Griffin elaborates on the point Dunlevie was making: “while the elements of success in the venture business do not repeat themselves precisely, they often rhyme. In evaluating companies, the successful VC will often see something that reminds them of patterns they have seen before.” Practical application of pattern recognition for business value is difficult. The great investors have a keen understanding of how to identify and apply patterns.

Pattern Recognition: A Gift or a Trap?

Written in 2003 by William Gibson, Pattern Recognition (G.P. Putnam’s Sons) is a novel that explores the human desire to synthesize patterns in what is otherwise meaningless data and information. The book chronicles a global traveler, a marketing consultant, who has to unravel an Internet-based mystery. In the course of the book, Gibson implies that humans find patterns in many places, but that does not mean that they are always relevant. In one part of the book, a friend of the marketing consultant states, “Homo sapiens are about pattern recognition. Both a gift and a trap.” The implication is that humans find some level of comfort in discovering patterns in data or in most any medium, as it helps to explain what would otherwise seem to be a random occurrence. The trap comes into play when there is really not a pattern to be discovered because, in that case, humans will be inclined to discover one anyway, just for the psychological comfort that it affords.

Patterns are useful and meaningful only when they are valid. The bias that humans have to find patterns, even if patterns don’t exist, is an important phenomenon to recognize, as that knowledge can help to tame these natural biases.

Tsukiji Market

The seafood will start arriving at Tsukiji before four in the morning, so an interested observer must start her day quite early. The market will see 400 different species passing through on any given day, eventually making their way to street carts or the most prominent restaurants in Tokyo. The auction determines the destination of each delicacy. In any given year, the fish markets in Tokyo will handle over 700,000 metric tons of seafood, representing a value of nearly $6 billion.

The volume of species passing through Tsukiji represents an interesting challenge in organizing and classifying the catch of the day. In the 2001 book Pattern Classification (Wiley), Richard Duda provided an interesting view of this process, using fish as an example.

With a fairly rudimentary example — fish sorting — Duda is able to explain a number of key aspects of pattern recognition.

A worker in a fish market, Tsukiji or otherwise, faces the problem of sorting fish on a conveyor belt according to their species. This must happen over and over again, and must be done accurately to ensure quality. In Duda’s simple example in the book, it’s assumed that there are only two types of fish: sea bass and salmon.

As the fish come in on the conveyor belt, the worker must quickly determine and classify the fishes’ species.

There are many factors that can distinguish one type of fish from another. It could be the length, width, weight, number and shape of fins, size of head or eyes, and perhaps the overall body shape.
There are also a number of factors that could interrupt or negatively affect the process of distinguishing (sensing) one type from the other. These factors may include the lighting, the position of the fish on the conveyor belt, the steadiness of the photographer taking the picture, and so on.

The process, to ensure the most accurate determination, consists of capturing the image, isolating the fish, taking measurements, and making a decision. However, the process can be enhanced or complicated, based on the number of variables. If an expert fisherman indicates that a sea bass is longer than salmon, that’s an important data point, and length becomes a key feature to consider. However, a few data points will quickly demonstrate that while sea bass are longer than salmon on average, there are many examples where that does not hold true. Therefore, we cannot make an accurate determination of fish type based on that factor alone.

With the knowledge that length cannot be the sole feature considered, selecting additional features becomes critical. Multiple features (for example, width and lightness) start to give a higher-confidence view of the fish type.
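To make the single-feature versus multi-feature point concrete, here is a minimal sketch in Python. The measurements are hand-made toy numbers, not real data and not taken from Duda's book, and the two rules (a midpoint threshold on length, and a nearest-centroid rule on scaled length and lightness) are assumptions chosen for simplicity rather than the method the book prescribes.

```python
# Toy, hand-made measurements (NOT real data): (length_cm, lightness, species).
FISH = [
    (50, 2.0, "salmon"), (55, 2.5, "salmon"), (60, 3.0, "salmon"),
    (70, 2.2, "salmon"), (75, 2.8, "salmon"),
    (58, 6.0, "sea bass"), (65, 6.5, "sea bass"), (72, 7.0, "sea bass"),
    (80, 6.2, "sea bass"), (85, 7.5, "sea bass"),
]
SPECIES = ("salmon", "sea bass")


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def length_only_classifier(fish):
    """One feature: cut halfway between the two species' mean lengths.

    Assumes, as the fisherman suggests, that sea bass are longer on average.
    """
    salmon_mean = mean(length for length, _, s in fish if s == "salmon")
    bass_mean = mean(length for length, _, s in fish if s == "sea bass")
    threshold = (salmon_mean + bass_mean) / 2
    return lambda length, lightness: "sea bass" if length > threshold else "salmon"


def two_feature_classifier(fish):
    """Two features: nearest centroid on min-max scaled (length, lightness)."""
    lows = [min(f[i] for f in fish) for i in (0, 1)]
    highs = [max(f[i] for f in fish) for i in (0, 1)]

    def scale(length, lightness):
        # Put both features on a 0..1 scale so neither dominates the distance.
        return tuple(
            (value - lows[i]) / (highs[i] - lows[i])
            for i, value in enumerate((length, lightness))
        )

    centroids = {}
    for species in SPECIES:
        points = [scale(l, b) for l, b, s in fish if s == species]
        centroids[species] = (mean(p[0] for p in points), mean(p[1] for p in points))

    def classify(length, lightness):
        x = scale(length, lightness)
        return min(
            SPECIES,
            key=lambda s: (x[0] - centroids[s][0]) ** 2 + (x[1] - centroids[s][1]) ** 2,
        )

    return classify


def accuracy(classify, fish):
    """Share of fish whose predicted species matches the label."""
    return sum(classify(l, b) == s for l, b, s in fish) / len(fish)
```

On this toy sample, `accuracy(length_only_classifier(FISH), FISH)` is 0.6, because the long salmon and short sea bass are misjudged, while `accuracy(two_feature_classifier(FISH), FISH)` is 1.0. Both figures are measured on the same ten fish used to build the rules, so they illustrate the idea rather than estimate real-world performance.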

Duda defines pattern recognition as the act of collecting raw data and taking an action based on the category of the pattern. Recognition is not an exact match. Instead, it’s an understanding of what is common, which can be extended to identify the factors that are repeatable.

A Method for Recognizing Patterns

Answering the three key questions (what is it?, where is it?, and how is it constructed?) seems straightforward until there is a large, complex set of data to be put through that test. At that point, answering those questions is much more daunting. Like any difficult problem, this calls for a process or method to break it into smaller steps. In this case, the method can be as straightforward as five steps, leading to conclusions from raw inputs:

1. Data acquisition and sensing: The measurement and collection of physical variables.

2. Pre-processing: Removing noise from the data and starting to isolate patterns of interest. In the fish example given earlier, you would isolate the fish from each other and from the background, so that patterns are well separated and do not overlap.

3. Feature extraction: Finding a new representation in terms of features. For the fish, you would measure certain features.

4. Classification: Utilizing features and learned models to assign a pattern to a category. For the fish, you would clearly identify the key distinguishing features (length, weight, etc.).

5. Post-processing: Assessing the confidence of decisions, by leveraging other sources of information or context. Ultimately, this step allows the application of context-dependent information, which improves outcomes.
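The five steps above can be read as a small pipeline in which each stage feeds the next. The sketch below is a toy under stated assumptions: the sensor readings, the lightness cut of 4.5, and the 0.5 review margin are all invented for illustration, and post-processing is reduced to a simple confidence check rather than a full use of context.

```python
def acquire():
    """Step 1, data acquisition: hypothetical sensor readings, one per fish."""
    return [
        {"id": 1, "length_cm": 52, "lightness": 2.1},
        {"id": 2, "length_cm": 81, "lightness": 6.8},
        {"id": 3, "length_cm": None, "lightness": 5.0},  # failed measurement
        {"id": 4, "length_cm": 66, "lightness": 4.4},
    ]


def preprocess(readings):
    """Step 2, pre-processing: drop incomplete readings (a stand-in for noise removal)."""
    return [r for r in readings if None not in r.values()]


def extract_features(reading):
    """Step 3, feature extraction: represent each fish as a feature vector."""
    return (reading["length_cm"], reading["lightness"])


def classify(features, lightness_cut=4.5):
    """Step 4, classification: assign a category and report the distance from the cut."""
    _, lightness = features
    label = "sea bass" if lightness > lightness_cut else "salmon"
    margin = abs(lightness - lightness_cut)
    return label, margin


def postprocess(label, margin, min_margin=0.5):
    """Step 5, post-processing: treat the margin as a crude confidence score."""
    return label if margin >= min_margin else "needs review"


def run_pipeline():
    """Run all five steps and return a decision per fish id."""
    results = {}
    for reading in preprocess(acquire()):
        label, margin = classify(extract_features(reading))
        results[reading["id"]] = postprocess(label, margin)
    return results
```

Calling `run_pipeline()` on these readings returns a decision for fish 1, 2, and 4. Fish 3 is dropped in pre-processing, and fish 4 sits so close to the cut that it is routed to a person instead of being labeled automatically.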

Pattern recognition techniques find application in many areas, from machine learning to statistics, from mathematics to computer science. The real challenge is practical application. And to apply these techniques, a framework is needed.

Elements of Success Rhyme (continued)

Pattern recognition can be a gift or a trap.

It’s a trap if a person is lulled into believing that history repeats itself and therefore there is simply a recipe to be followed. This is lazy thinking, which rarely leads to exceptional outcomes or insights.

On the other hand, it’s a gift to realize that, as mentioned in the introduction, the elements of success rhyme. Said another way, there are commonalities between successful strategies in businesses or other settings. And the proper application of a framework or methodology to identify patterns and to understand what is a pattern and what is not can be very powerful.

The inherent bias within humans will seek patterns, even where patterns do not exist. Understanding a pattern versus the presence of a bias is a differentiator in the Data era. Indeed, big data provides a means of identifying statistically significant patterns in order to avoid these biases.
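One common way to check whether an apparent pattern is more than a trick of the eye is a permutation test: reshuffle the group labels and ask how often chance alone produces a difference at least as large as the one observed. The sketch below is a minimal, exact version for two small groups. The function name and the sample numbers are assumptions for illustration, and enumerating every split is only practical for small samples.

```python
from itertools import combinations


def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation test for a difference in group means.

    Returns the share of all possible relabelings whose difference in means
    is at least as large as the observed one. A small value suggests the
    pattern is unlikely to be chance; a large value suggests it may be noise.
    """
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    n_b = len(pooled) - n_a
    total = sum(pooled)

    def mean_gap(sum_a):
        return abs(sum_a / n_a - (total - sum_a) / n_b)

    observed = mean_gap(sum(group_a))
    gaps = [
        mean_gap(sum(pooled[i] for i in indices))
        for indices in combinations(range(len(pooled)), n_a)
    ]
    # Small tolerance so floating-point ties count as "at least as large".
    return sum(gap >= observed - 1e-12 for gap in gaps) / len(gaps)
```

With clearly separated groups such as `[1, 2, 3, 4, 5]` and `[11, 12, 13, 14, 15]`, only 2 of the 252 possible splits match the observed gap, so the p-value is below 0.01. With two groups that merely look a little different, such as `[3, 5, 4, 6, 5]` and `[4, 6, 5, 5, 4]`, every relabeling produces a gap at least as large, which is a sign that the "pattern" is the trap rather than the gift.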

This post is adapted from the book, Big Data Revolution: What farmers, doctors, and insurance agents teach us about discovering big data patterns, Wiley, 2015. Find more on the web at

Thursday, February 25, 2016

Ubuntu: A New Way to Work

“Teamwork and intelligence wins championships.” — Michael Jordan

There was an anthropologist dispatched to Africa many years ago to study the lives and customs of local tribes. While each one is unique, they share many customs across the geographies and locations. The anthropologist tells a story of how one time he brought along a large basket of candy, which quickly got the attention of all the children in the tribe. Instead of just handing it out, he decided to play a game. He set the basket of candy under a tree and gathered all of the children about 50 yards away from the tree. He informed them that they would have a race, and that the first child to get there could keep all of the candy to themselves. The children lined up, ready for the race. When the anthropologist said “Go”, he was surprised to see what happened: all of the children joined hands and moved towards the tree in unison. When they got there, they neatly divided up the candy and sat down to enjoy it together. When he questioned why they did this, the children responded, “Ubuntu. How could any of us be happy if all the others were sad?”

Nelson Mandela describes it well: “In Africa, there is a concept known as Ubuntu- the profound sense that we are human only through the humanity of others; that if we are to accomplish anything in this world it will in equal measure be due to the work and achievements of others.”


Read the rest on Medium.