In an earlier post Towards the Data-Driven Business
, I talked about the various roles that data and intelligence can play in the business. But where do you start? In this post, I shall talk about the approach that I have developed and used in a number of large organizations.
To build a roadmap that takes you into the future from where you are today, you need three things.
an understanding of the present. This includes producing AS-IS models of your current (legacy) systems, what data have you got and how are you currently managing and using it. We need to know about the perceived pain points, not because we only want to fix the symptoms, but because these will help us build a consensus for change. Typically we find a fair amount of duplicated and inconsistent data, crappy or non-existent interfaces, slow process loops and data bottlenecks, and general inflexibility.
This is always complicated by the fact that there are already numerous projects underway to fix some of the problems, or to build additional functionality, so we need to understand how these projects are expected to alter the landscape, and in what timescale. It sometimes becomes apparent that these projects are not ideally planned and coordinated from a data management perspective. If we find overlapping or fragmented responsibility in some critical data areas, we may need to engage with programme management and governance to support greater consistency and synergy.
a vision of the future opportunities for data and intelligence (and automation based on these). In general terms, these are outlined in my earlier post. To develop a vision for a specific organization, we need to look at their business model - what value do they provide to customers and other stakeholders, how is this value delivered (as business services or otherwise), and how do the capabilities and processes of the organization and its partners support this.
For example, I worked with an organization that had done a fair amount of work on modelling their internal processes and procedures, but lacked the outside-in view. So I developed a business service architecture that showed how the events and processes in their customers' world triggered calls on their services, and what this implied for delivering a seamless experience to their customers.
Using a capability-based planning
approach, we can then look at how data, intelligence and automation could improve not only individual business services, processes and underlying capabilities, but also the coordination and feedback loops between these. For example in a retail environment, there are typically processes and capabilities associated with both Buying and Selling, and you may be able to use data and intelligence to make each of them more efficient and effective. But more importantly, you can improve the alignment between
Buying and Selling.
(In some styles of business capability model, coordination is shown explicitly as a capability in its own right, but this is not a common approach.)
The business model also identifies which areas are strategically important to the business. At one organization, when we mapped the IT costs against the business model, we found that a disproportionate amount of effort was being devoted to non-strategic stuff, and surprisingly little effort for the customer-facing (therefore more strategically important) activities. (A colour-coded diagram can be very useful in presenting such issues to senior management.)
Most importantly, we find that a lot of stakeholders (especially within IT) have a fairly limited vision about what is possible, often focused on the data they already have rather than the data they could or should have. The double-diamond approach to design thinking works here, to combine creative scenario planning with highly focused practical action. I've often found senior business people much more receptive to these kind of discussions than the IT folk.
We should then be able to produce a reasonably future-proof and technology independent TO-BE data and information architecture, which provides a loosely-coupled blueprint for data collection, processing and management.
, how to get from A to B. In a large organization, this is going to take several years. A complete roadmap cannot just be a data strategy, but will usually involve some elements of business process and organizational change, as well as application, integration and technology strategy. It may also involve outside stakeholders - for example, providing direct access to suppliers and business partners via portals and APIs, and sharing data and intelligence with them, while obtaining consent from data subjects and addressing any other privacy, security and compliance issues. There are always dependencies between different streams of activity within the programme as well as with other initiatives, and these dependencies need to be identified and managed, even if we can avoid everything being tightly coupled together.
Following the roadmap will typically contain a mix of different kinds of project. There may need to be some experimental ("pioneer") projects as well as larger development and infrastructure ("settler", "town planner") projects.
To gain consensus and support, you need a business case. Although different organizations may have different ways of presenting and evaluating the business case, and some individuals and organizations are more risk-averse than others, a business case will always involve an argument that the benefits (financial and possibly non-financial) outweigh the costs and risks.
Generally, people like to see some short-term benefits ("quick wins" or the dreaded "low-hanging fruit
") as well as longer-term benefits. A well-balanced roadmap spreads the benefits across the phases - if you manage to achieve 80% of the benefits in phase 1, then your roadmap probably wasn't ambitious enough, so don't be surprised if nobody wants to fund phase 2.
, you have to implement your roadmap. This means getting the funding and resources, kicking off multiple projects as well as connecting with relevant projects already underway, managing and coordinating the programme. It also means being open to feedback and learning, responding to new emerging challenges (such as regulation and competition), maintaining communication with stakeholders, and keeping the vision and roadmap alive and up-to-date.
What kinds of automation are there, and is there a natural progression from basic to advanced? Do the terms intelligent automation
and cognitive automation
actually mean anything useful, or are they merely vendor hype? In this blogpost, I shall attempt an answer.
The simplest form of automation is known as robotic automation or robotic process automation (RPA). The word robot (from the Czech word for forced labour, robota
) implies a pre-programmed response to a set of incoming events. The incoming events are represented as structured data, and may be held in a traditional database. The RPA tools also include the connectivity and workflow technology to receive incoming data, interrogate databases and drive action, based on a set of rules.
People talk about cognitive technology or cognitive computing, but what exactly does this mean? In its marketing material, IBM uses these terms to describe whatever features of IBM Watson they want to draw our attention to – including adaptability, interactivity and persistence – but IBM’s usage of these terms is not universally accepted.
I understand cognition
to be all about perceiving and making sense of the world, and we are now seeing man-made components that can achieve some degree of this, sometimes called Cognitive Agents
Cognitive agents can also be used to detect patterns in vast volumes of structured and unstructured data and interpret their meaning. This is known as Cognitive Insight
, which Thomas Davenport and Rajeev Ronanki refer to as “analytics on steroids”. The general form of the cognitive agent is as follows.
Cognitive agents can be wrapped as a service and presented via an API, in which case they are known as Cognitive Services
. The major cloud platforms (AWS, Google Cloud, Microsoft Azure) provide a range of these services, including textual sentiment analysis.
At the current state-of-the-art, cognitive services may be of variable quality. Image recognition may be misled by shadows, and even old-fashioned OCR may struggle to generate meaningful text from poor resolution images. – but of course human cognition is also fallible.
Meanwhile, one of the key characteristics of intelligence is adaptability – being able to respond flexibly to different conditions. Intelligence is developed and sustained by feedback loops – detecting outcomes and adjusting behaviour to achieve goals. Intelligent automation therefore includes a feedback loop, typically involving some kind of machine learning.
Complex systems and processes may require multiple feedback loops (Double-Loop or Triple-Loop Learning).
If we embed this automation into the Internet of Things, we can use sensors to perform the information gathering, and actuators to carry out the actions.
Now what happens if we put all these elements together?
This fits into a more general framework of human-computer intelligence, in which intelligence is broken down into six interoperating capabilities.
I know that some people will disagree with me as to which parts of this framework are called "cognitive" and which parts "intelligent". Ultimately, this is just a matter of semantics. The real point is to understand how all the pieces of cognitive-intelligent automation work together.
The Limits of Machine Intelligence
There are clear limits to what machines can do – but this doesn’t stop us getting them to perform useful work, in collaboration with humans where necessary. (Collaborative robots are sometimes called cobots
.) A well-designed collaboration between human and machine can achieve higher levels of productivity and quality than either human or machine alone. Our framework allows us to identify several areas where human abilities and artificial intelligence can usefully combine.
In the area of perception and cognition, there are big differences in the way that humans and machines view things, and therefore significant differences in the kinds of kinds of cognitive mistakes they are prone to. Machines may spot or interpret things that humans might miss, and vice versa. There is good evidence for this effect in medical diagnosis, where a collaboration between human medic and AI can often produce higher accuracy than either can achieve alone.
In the area of decision-making, robots can make simple decisions much faster, but may be unreliable with more complex or borderline decisions, so a hybrid “human-in-the-loop” solution may be appropriate.
Decisions that affect real people are subject to particular concern – GDPR specifically regulates any automated decision-making or profiling that is made without human intervention, because of the potential impact on people’s rights and freedoms. In such cases, the “human-in-the-loop” solution reduces the perceived privacy risk. In the area of communication and collaboration, robots can help orchestrate complex interactions between multiple human experts, and allow human observations to be combined with automatic data gathering. Meanwhile, sophisticated chatbots are enabling more complex interactions between people and machines.
Finally there is the core capability of intelligence – learning. Machines learn by processing vast datasets of historical data – but that is also their limitation. So learning may involve fast corrective action by the robot (using machine learning), with a slower cycle of adjustment and recalibration by human operators (such as Data Scientists). This would be an example of Double-Loop learning.
Some of the elements of this automation framework are already fairly well developed, with cost-effective components available from the technology vendors. So there are some modes of automation that are available for rapid deployment. Other elements are technologically immature, and may require a more cautious or experimental approach.
Your roadmap will need to align the growing maturity of your organization with the growing maturity of the technology, exploiting quick wins today while preparing the groundwork to be in a position to take advantage of emerging tools and techniques in the medium term.
Thomas Davenport and Rajeev Ronanki, Artificial Intelligence for the Real World
Related posts: Automation Ethics
(August 2019), RPA - Real Value or Painful Experimentation?
If we want to build a data-driven business, we need to appreciate the various roles that data and intelligence can play in the business - whether improving a single business service, capability or process, or improving the business as a whole. The examples in this post are mainly from retail, but a similar approach can easily be applied to other sectors.
Sense-Making and Decision Support
The traditional role of analytics and business intelligence is helping the business interpret and respond to what is going on.
Once upon a time, business intelligence always operated with some delay. Data had to loaded from the operational systems into the data warehouse before they could be processed and analysed. I remember working with systems that generated management infomation based on yesterday's data, or even last month's data. Of course, such systems don't exist any more, because people expect real-time insight, based on streamed data.
Management information systems are supposed to support individual and collective decision-making. People often talk about actionable intelligence
, but of course it doesn't create any value for the business until it is actioned
. Creating a fancy report or dashboard isn't the real goal, it's just a means to an end.
Analytics can also be used to calculate complicated chains of effects on a what-if
basis. For example, if we change the price of this product by this much, what effect is this predicted to have on the demand for other products, what are the possible responses from our competitors, how does the overall change in customer spending affect supply chain logistics, do we need to rearrange the shelf displays, and so on. How sensitive is Y to changes in X, and what is the optimal level of Z?
Analytics can also be used to support large-scale optimization - for example, solving complicated scheduling problems.
Increasingly, we are looking at the direct actioning of intelligence, possibly in real-time. The intelligence drives automated decisions within operational business processes, often without a human-in-the-loop, where human supervision and control may be remote or retrospective. A good example of this is dynamic retail pricing, where an algorithm adjusts the prices of goods and services according to some model of supply and demand. In some cases, optimized plans and schedules can be implemented without a human in the loop.
So the data doesn't just flow from the operational systems into the data warehouse, but there is a control flow back into the operational systems. We can call this closed loop intelligence
(If it takes too much time to process the data and generate the action, the action may no longer be appropriate. A few years ago, one of my clients wanted to use transaction data from the data warehouse to generate emails to customers - but with their existing architecture there would have been a 48 hour delay from the transaction to the email, so we needed to find a way to bypass this.)
If you have millions of customers buying hundreds of thousands of products, you need ways of aggregating the data in order to manage the business effectively. Customers can be grouped into segments, products can be grouped into categories, and many organizations use these groupings as a basis for dividing responsibilities between individuals and teams. However, these groupings are typically inflexible and sometimes seem perverse.
For example, in a large supermarket, after failing to find maple syrup next to the honey as I expected, I was told I should find it next to the custard. There may well be a logical reason for this grouping, but this logic was not apparent to me as a customer.
But the fact that maple syrup is in the same product category as custard doesn't just affect the shelf layout, it may also mean that it is automatically included in decisions affecting the custard category and excluded from decisions affecting the honey category. For example, pricing and promotion decisions.
A data-driven business is able to group things dynamically, based on affinity or association, and then allows simple and powerful decisions to be made for this dynamic group, at the right level of aggregation.
Automation can then be used to cascade the action to all affected products, making the necessary price, logistical and other adjustments for each product. This means that a broad plan can be quickly and consistently implemented across thousands of products.
Experimentation and Learning
In a data-driven business, every activity is designed for learning as well as doing. Feedback is used in the cybernetic sense - collecting and interpreting data to control and refine business rules and algorithms.
In a dynamic world, it is necessary to experiment constantly. A supermarket or online business is a permanent laboratory for testing the behaviour of its customers. For example, A/B testing where alternatives are presented to different customers on different occasions to test which one gets the best response. As I mentioned in an earlier post, Netflix declares themselves "addicted" to the methodology of A/B testing.
In a simple controlled experiment, you change one variable and leave everything else the same. But in a complex business world, everything is changing. So you need advanced statistics and machine learning, not only to interpret the data, but also to design experiments that will produce useful data.
A traditional command-and-control organization likes to keep the intelligence and insight in the head office, close to top management. An intelligent organization on the other hand likes to mobilize the intelligence and insight of all its people, and encourage (some) local flexibility (while maintaining global consistency). With advanced data and intelligence tools, power can be driven to the edge of the organization, allowing for different models of delegation and collaboration. For example, retail management may feel able to give greater autonomy to store managers, but only if the systems provide faster feedback and more effective support.
Related to the previous point, data and intelligence can provide clarity and governance to the business, and to a range of other stakeholders. This has ethical as well as regulatory implications.
Among other things, transparent data and intelligence reveal their provenance and derivation. (This isn't the same thing as explanation, but it probably helps.)
Obviously most organizations already have many of the pieces of this, but there are typically major challenges with legacy systems and data - especially master data management. Moving onto the cloud, and adopting advanced integration and robotic automation tools may help with some of these challenges, but it is clearly not the whole story.
Some organizations may be lopsided or disconnected in their use of data and intelligence. They may have very sophisticated analytic systems in some areas, while other areas are comparatively neglected. There can be a tendency to over-value the data and insight you've already got, instead of thinking about the data and insight that you ought to have.
Making an organization more data-driven doesn't always entail a large transformation programme, but it does require a clarity of vision and pragmatic joined-up thinking.
Related posts: Rhyme or Reason: The Logic of Netflix
(June 2017), Setting off towards the Data-Driven Business
Updated 13 September 2019
Some people think that ethical principles only apply to implemented systems, and that experimental projects (trials, proofs of concept, and so on) don't need the same level of transparency and accountability.
Last year, Google employees (as well as US senators from both parties) expressed concern about Google's Dragonfly project, which appeared to collude with the Chinese government in censorship and suppression of human rights. A secondary concern was that Dragonfly was conducted in secrecy, without involving Google's privacy team.
Google's official position (led by CEO Sundar Pinchai) was that Dragonfly was "just an experiment". Jack Poulson, who left Google last year over this issue and has now started a nonprofit organization called Tech Inquiry, has also seen this pattern in other technology projects.
"I spoke to coworkers and they said 'don’t worry, by the time the thing launches, we'll have had a thorough privacy review'. When you do R and D, there's this idea that you can cut corners and have the privacy team fix it later." (via Alex Hern)
A few years ago, Microsoft Research ran an experiment on "emotional eating", which involved four female employees wearing smart bras. "Showing an almost shocking lack of sensitivity for gender stereotyping", wrote Sebastian Anthony.
While I assume that the four subjects willingly volunteered to participate in this experiment, and I hope the privacy of their emotional data was properly protected, it does seem to reflect the same pattern - that you can get away with things in the R and D stage that would be highly problematic in a live product.
Poulson's position is that the engineers working on these projects bear some responsibility for the outcomes, and that they need to see that the ethical principles are respected. He therefore demands transparency to avoid workers being misled. He also notes that if the ethical considerations are deferred to a late stage of a project, with the bulk of the development costs already incurred and many stakeholders now personally invested in the success of the project, the pressure to proceed quickly to launch may be too strong to resist.
Sebastian Anthony, Microsoft’s new smart bra stops you from emotionally overeating
(Extreme Tech, 9 December 2013)
Erin Carroll et al, Food and Mood: Just-in-Time Support for Emotional Eating
(Humaine Association Conference on Affective Computing and Intelligent Interaction, 2013)
Ryan Gallagher, Google’s Secret China Project “Effectively Ended” After Internal Confrontation
(The Intercept, 17 December 2018)
Alex Hern, Google whistleblower launches project to keep tech ethical
(Guardian, 13 July 2019)
Casey Michel, Google’s secret ‘Dragonfly’ project is a major threat to human rights
(Think Progress, 11 Dec 2018)
Iain Thomson, Microsoft researchers build 'smart bra' to stop women's stress eating
(The Register, 6 Dec 2013)
Are algorithms trustworthy, asks @NizanGP
"Many of us routinely - and even blindly - rely on the advice of algorithms in all aspects of our lives, from choosing the fastest route to the airport to deciding how to invest our retirement savings. But should we trust them as much as we do?"
Dr Packin's main point is about the fallibility of algorithms, and the excessive confidence people place in them. @AnnCavoukian
reinforces this point.
But there is another reason to be wary of the advice of the algorithm, summed up by the question: Whom does the algorithm serve?
Because the algorithm is not working for you alone. There are many people trying to get to the airport, and if they all use the same route they may all miss their flights. If the algorithm is any good, it will be advising different people to use different routes. (Most well-planned cities have more than one route to the airport, to avoid a single point of failure.) So how can you trust the algorithm to give you the fastest route? However much you may be paying for the navigation service (either directly, or bundled into the cost of the car/device), someone else may be paying a lot more. For the road less travelled.
The algorithm-makers may also try to monetize the destinations. If a particular road is used for getting to a sports venue as well as the airport, then the two destinations can be invited to bid to get the "best" routes for their customers - or perhaps for themselves. ("Best" may not mean fastest - it could mean the most predictable. And the venue may be ambivalent about this - the more unpredictable the journey, the more people will arrive early to be on the safe side, spreading the load on the services as well as spending more on parking and refreshments.)
In general, the algorithm is juggling the interests of many different stakeholders, and we may assume that this is designed to optimize the commercial returns to the algorithm-makers.
The same is obviously true of investment advice. The best time to buy a stock is just before everyone else buys, and the best time to sell a stock is just after everyone else buys. Which means that there are massive opportunities for unethical behaviour when advising people where / when to invest their retirement savings, and it would be optimistic to assume that the people programming the algorithms are immune from this temptation, or that regulators are able to protect investors properly.
And that's before we start worrying about the algorithms being manipulated by hostile agents ...
So remember the Weasley Doctrine: "Never trust anything that can think for itself if you can't see where it keeps its brain."
Nizan Geslevich Packin, Why Investors Should Be Wary of Automated Advice
(Wall Street Journal, 14 June 2019)
Dozens of drivers get stuck in mud after Google Maps reroutes them into empty field
(ABC7 New York, 26 June 2019) HT @jonerp
Related posts: Towards Chatbot Ethics
(May 2019), Whom does the technology serve?
(May 2019), Robust Against Manipulation
Updated 27 July 2019
More Recent Articles