How many pillars (or components or building blocks) does your data strategy need? I found lots of different answers, from random bloggers to the UK Government.
An anonymous blogger who writes under the pen-name Beautiful Data identifies three pillars of data strategy.
- Data Management - managing data as an asset
- Data Democratization - putting data into the hands of the business
- Data Monetization - driving direct and indirect business benefit
SnapAnalytics identifies People, Process, Data and Technology as its four pillars, and that's a popular approach for many things.
For a different approach, we have four pillars of data strategy from Aleksander Velkoski, the Director of Data Science at the National Association of Realtors.
- Data Literacy
- Data Acquisition and Governance
- Knowledge Mining
- Business Implementation
Olga Lagunova, Chief Data Analytics Officer at Pitney Bowes, identifies four pillars that are roughly similar.
- Business Outcome - knowing what you want to achieve
- Mature Data Ecosystem - including data sourcing and data governance
- Data Science - practices and organization
- Culture that values data-driven decisions
In his conversation with her, Anthony Scriffignano, Chief Data Scientist at Dun & Bradstreet, replies that "we have many of those same elements". Perhaps because he is in the business of selling data, Anthony looks at data strategy from two directions, which broadly correspond to Olga's first two pillars.
- Customer-centric - addressing customer needs, solving ever more complex business problems
- Data-centric - data supply chain, including sourcing, quality assurance and governance
The UK National Data Strategy also has four pillars.
- Data Foundations
- Data Skills
- Data Availability
- Responsible Data
A white paper from SAS defines five essential components of a data strategy - Identify, Store, Provision, Process and Govern. But a component isn't a pillar. So the editors of Ingenium magazine have turned these into five pillars - Identify, Store, Provision, Integrate and Govern.
(The SAS paper talks a lot about integration, so the Ingenium modification of the SAS list seems fair.)
For six pillars, we can turn to Cynozure, a UK-based data and analytics strategy consultancy.
- Vision and Value
- People and Culture
- Operating Model
- Technology and Architecture
- Data Governance
Cynozure has also published seven building blocks.
- Data Vision
- Data Sources
- Data Governance and Management
- Data Analysis
- Data Team
- Tech Stack
- Measuring Success
At last we get to the magic number seven, thanks to @EvanLevy.
- The Questions (aka Problems) - the more valuable your question, the more valuable analytics is to the company
- Technical Implementation - he argues that the most valuable datasets require high levels of customization
- The Users - access and control - this links to the Data Democratization pillar mentioned above
- Data Storage and Structure - including data retention
- Data Security - risk and compliance
- Personally Identifiable Information (PII) - privacy
- Visualization and Analysis Needs - flexibility and timeliness
Lawrence of Arabia's memoir was entitled Seven Pillars of Wisdom, and this is of course a reference to the Bible.
Wisdom has built her house; she has set up its seven pillars. ...
Leave your simple ways and you will live; walk in the way of insight.
Maybe it doesn't matter how many pillars your data strategy has, as long as it gets you walking in the way of insight. (Whatever that means.)
Obviously not everyone is using the pillar metaphor in the same way - there is presumably some difference between a foundation, a pillar and a building block - but there is a lot of commonality here as well, with a widely shared emphasis on business value and people, as well as a few interesting outliers.
While most of the sources listed in this blogpost are fairly brief, the UK National Data Strategy contains a lot of detail. While it deserves credit for the attention devoted to ethics and accountability in the Responsible Data pillar, it is not yet clear to me how it addresses some of the other concerns mentioned in this blogpost. I plan to post a more thorough review in a separate blogpost.
"Beautiful Data", Three Pillars of a Data Strategy (19 Sept ??)
Cynozure, Building A Data Strategy For Business Success (Cynozure, 29 May 2019)
Jason Foster, The Six Pillars of a Data Strategy (Cynozure via YouTube, 19 April 2019)
Ingenium, The 5 Pillars of a Data Strategy (Ingenium Magazine, 24 August 2017)
Evan Levy, 7 Pillars of Data Strategy (HighFive, 1 March 2018)
SAS, The 5 Essential Components of a Data Strategy (SAS 2018)
Anthony Scriffignano and Olga Lagunova, Data Strategy - Key Pillars That Define Success (Dun & Bradstreet via YouTube, 29 March 2018)
UK Government, UK National Data Strategy (Department for Digital, Culture, Media and Sport, 9 September 2020)
Aleksander Velkoski, The Four Pillars of Data and Analytics Strategy (Business Quick, 24 August 2020)
If your #datastrategy involves collecting and harvesting more data, then it makes sense to check this requirement at an early stage of a new project or other initiative, rather than adding data collection as an afterthought.
For requirements such as security and privacy, the not-as-afterthought heuristic is well established in the practices of security-by-design and privacy-by-design. I have also spent some time thinking and writing about technology ethics, under the heading of responsibility-by-design. In my October 2018 post on Responsibility by Design, I suggested that all of these could be regarded as instances of a general pattern of X-by-design, outlining What, Why, When, For Whom, Who, How and How Much for a given concern X.
In this post, I want to look at three instances of the X-by-design pattern that could support your data strategy:
- data collection by design
- data quality by design
- data governance by design
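The What/Why/When/For Whom/Who/How/How Much questions above can be captured as a simple checklist structure. This is only an illustrative sketch in Python; the field names are my own rendering of the post's questions, not an established schema:

```python
from dataclasses import dataclass

@dataclass
class XByDesign:
    """Checklist for treating a concern X 'by design' rather than as an afterthought."""
    concern: str          # X - e.g. security, privacy, data collection
    what: str = ""        # what the concern covers
    why: str = ""         # motivation and value
    when: str = ""        # at which lifecycle stages it applies
    for_whom: str = ""    # stakeholders who benefit
    who: str = ""         # who is responsible
    how: str = ""         # practices and mechanisms
    how_much: str = ""    # proportionality - how much effort is justified

# Example instance for the first concern discussed below
data_collection = XByDesign(
    concern="data collection",
    when="from the start of a new project, not tacked on later",
    who="product engineers, with input from the data team",
)
```

The same structure can be filled in for data quality and data governance, which keeps the three concerns comparable side by side.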
Data Collection by Design
Here's a common scenario. Some engineers in your organization have set up a new product or service or system or resource. This is now fully operational, and appears to be working properly. However, the system is not properly instrumented.
Thought should always be given to the self instrumentation of the prime equipment, i.e. design for test from the outset. Kev Judge
In the past, it was common for a system to be instrumented during the test phase, but once the tests were completed, data collection was switched off for performance reasons.
If there is concern that the self instrumentation can add unacceptable processing overheads then why not introduce a system of removing the self instrumentation before delivery? Kev Judge
Data collection matters not just for operational testing and monitoring but also for business intelligence. And for IBM, this is an essential component of digital advantage:
Digitally reinvented electronics organizations pursue new approaches to products, processes and ecosystem participation. They design products with attention toward the types of information they need to collect to design the right customer experiences. IBM
The point here is that a new system or service needs to have data collection designed in from the start, rather than tacked on later.
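As a minimal sketch of what "designed in from the start" can mean in practice, here is a hypothetical service whose business operation emits a structured event as part of the transaction itself, so the same stream can feed both operational monitoring and business intelligence. The function and field names are illustrative, not taken from any of the sources above:

```python
import json
import logging
from datetime import datetime, timezone

logger = logging.getLogger("events")

def emit_event(event_type: str, **fields) -> dict:
    """Build and log a structured event; returns the record for downstream use."""
    record = {
        "event": event_type,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        **fields,
    }
    logger.info(json.dumps(record))
    return record

def place_order(customer_id: str, amount: float) -> dict:
    # Instrumentation is part of the transaction itself, not bolted on after go-live
    return emit_event("order_placed", customer_id=customer_id, amount=amount)
```

Because the event is emitted inside the business function, there is nothing to "switch off for performance reasons" later; if overhead is a concern, the logging handler can be tuned without touching the business logic.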
Data Quality by Design
The next pitfall I want to talk about is when a new system or service is developed, the data migration / integration is done in a big rush towards the end of the project, and then - surprise, surprise - the data quality isn't good enough.
This is particularly relevant when data is being repurposed. During the pandemic, there was a suggestion of using Bluetooth connection strength as a proxy for the distance between two phones, and therefore an indicator of the distance between the owners of the phones. Although this data might have been adequate for statistical analysis, it was not good enough to justify putting a person into quarantine.
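The Bluetooth example suggests a useful distinction: a record can pass basic quality checks and still not be fit for every purpose. The sketch below (illustrative Python; the field names and the -70 dBm threshold are assumptions, not from the sources) separates generic quality checks from a stricter fitness-for-purpose test:

```python
from dataclasses import dataclass

@dataclass
class Reading:
    device_id: str
    rssi: int  # Bluetooth received signal strength in dBm (typically negative)

def quality_problems(reading: Reading) -> list:
    """Return a list of data quality problems; an empty list means the record passes."""
    problems = []
    if not reading.device_id:
        problems.append("missing device_id")
    if not -100 <= reading.rssi <= 0:
        problems.append("rssi out of plausible range")
    return problems

def fit_for_individual_decision(reading: Reading) -> bool:
    # A record clean enough for aggregate statistics may still be too noisy
    # to justify an individual-level decision such as quarantine.
    return not quality_problems(reading) and reading.rssi > -70  # threshold assumed
```

Running checks like these at the point of ingestion, rather than during a rushed end-of-project migration, is one concrete form of data quality by design.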
Data Governance by Design
Finally, there is the question of the sociotechnical organization and processes needed to manage and support the data - not only data quality but all other aspects of data governance.
The pitfall here is to believe you can sort out the IT plumbing first, leaving the necessary governance and controls to be added in later.
Scott Burnett, Reza Firouzbakht, Cristene Gonzalez-Wertz and Anthony Marshall, Using Data by Design (IBM Institute for Business Value, 2018)
Cybernetics helps us understand dynamic systems that are driven by a particular type of data. Here are some examples:
- Many economists see markets as essentially driven by price data.
- On the Internet (especially social media) we can see systems that are essentially driven by click data.
- Stan culture, where hardcore fans gang up on critics who fail to give the latest album a perfect score, is essentially driven by review scores
In a recent interview with Alice Pearson of CRASSH, Professor Will Davies explains the process as follows:
For Hayek, the advantage of the market was that it was a space in which stimulus and response could be in a constant state of interactivity: that prices send out information to people, which they respond to either in the form of consumer decisions or investment decisions or new entrepreneurial strategies.
Davies argued that this is now managed on screens, with traders on Wall Street and elsewhere constantly interacting with (as he says) flashing numbers that are rising and falling.
The way in which the market is visualized to people, the way it presents itself to people, the extent to which it is visible on a single control panel, is absolutely crucial to someone's ability to play the market effectively.
Davies attributes to cybernetics a particular vision of human agency:
to think of human beings as black boxes which respond to stimuluses in particular ways that can be potentially predicted and controlled. (In market trading, this thought leads naturally to replacing human beings with algorithmic trading.)
Davies then sees this cybernetic vision encapsulated in the British government approach to the COVID-19 pandemic.
What you see now with this idea of Stay Alert ... is a vision of an agent or human being who is constantly responsive and constantly adaptable to their environment, and will alter their behaviour depending on what types of cues are coming in from one moment to the next. ... The ideological vision being presented is of a society in which the rules of everyday conduct are going to be constantly tweaked in response to different types of data, different things that are appearing on the control panels at the Joint Biosecurity Centre.
The word alert originally comes from an Italian military term, all'erta - to the watch. So the slogan Stay Alert implies a visual idea of agency. But as Alice Pearson pointed out, that which is supposed to be the focus of our alertness is invisible. And it is not just the virus itself that is invisible: given the frequency of asymptomatic carriers, so is the question of which people are infectious and should be avoided.
So what visual or other signals is the Government expecting us to be alert to? If we can't watch out for symptoms, perhaps we are expected instead to watch out for significant shifts in the data - ambiguous clues about the effectiveness of masks or the necessity of quarantine. Or perhaps significant shifts in the rules.
Most of us only see a small fraction of the available data - Stafford Beer's term for this is attenuation, and Alice Pearson referred to hyper-attenuation. So we seem to be faced with a choice between on the one hand a shifting set of rules based on the official interpretation of the data - assuming that the powers-that-be have a richer set of data than we do, and a more sophisticated set of tools for managing the data - and on the other hand an increasingly strident set of activists encouraging people to rebel against the official rules, essentially setting up a rival set of norms in which for example mask-wearing is seen as a sign of capitulation to a socialist regime run by Bill Gates, or whatever.
Later in the interview, and also in his New Statesman article, Davies talks about a shifting notion of rules, from a binding contract to mere behavioural nudges.
Rules morph into algorithms, ever-more complex sets of instructions, built around an if/then logic. By collecting more and more data, and running more and more behavioural tests, it should in principle be possible to steer behaviour in the desired direction. ... The government has stumbled into a sort of clumsy algorithmic mentality. ... There is a logic driving all this, but it is one only comprehensible to the data analyst and modeller, while seeming deeply weird to the rest of us. ... To the algorithmic mind, there is no such thing as rule-breaking, only unpredicted behaviour.
One of the things that differentiates the British government from more accomplished practitioners of data-driven biopower (such as Facebook and WeChat) is the apparent lack of fast and effective feedback loops. If what the British government is practising counts as cybernetics at all, it seems to be a very primitive and broken version of first-order cybernetics.
When Norbert Wiener introduced the term cybernetics over seventy years ago, describing thinking as a kind of information processing and people as information processing organisms, this was a long way from simple behaviourism. Instead, he emphasized learning and creativity, and insisted on
the liberty of each human being to develop in his freedom the full measure of the human possibilities embodied in him.
In a talk on the entanglements of bodies and technologies, Lucy Suchman draws on an article by Geoff Bowker to describe the universal aspirations of cybernetics.
Cyberneticians declared a new age in which Darwin's placement of man as one among the animals would now be followed by cybernetics' placement of man as one among the machines.
However, as Suchman reminds us
Norbert Wiener himself paid very careful attention to questions of labour, and actually cautioned against the too-broad application of models that were designed in relation to physical or computational systems to the social world.
Even if sometimes seeming outnumbered, there have always been some within the cybernetics community who are concerned about epistemology and ethics. Hence second-order (or even third-order) cybernetics.
Geoffrey Bowker, How to be universal: some cybernetic strategies, 1943-1970 (Social Studies of Science 23, 1993) pp 107-127
Some good down-to-earth points from #ASPC20, @airpowerassn 's Air and Space Power Conference earlier this month. Although the material was aimed at a defence audience, much of the discussion is equally relevant to civilian and commercial organizations interested in information superiority (US) or information advantage (UK).
Professor Dame Angela McLean, who is the Chief Scientific Advisor to the MOD, defined information advantage as follows:
The credible advantage gained through the continuous, decisive and resilient employment of information and information systems. It involves exploiting information of all kinds to improve every aspect of operations: understanding, decision-making, execution, assessment and resilience.
She noted the temptation for the strategy to jump straight to technology (technology push); the correct approach is to set out ambitious, enduring capability outcomes (capability pull), although this may be harder to communicate. Nevertheless, technology push may make sense in those areas where technologies could contribute to multiple outcomes.
She also insisted that it was not enough just to have good information; it was also necessary to use this information effectively, and she called for cultural change to drive improved evidence-based decision-making. (This chimes with what I've been arguing myself, including the need for intelligence to be actioned, not just actionable.)
In his discussion of multi-domain integration, General Sir Patrick Sanders reinforced some of the same points.
- Superiority in information (is) critical to success
- We are not able to capitalise on the vast amounts of data our platforms can deliver us, as they are not able to share, swap or integrate data at a speed that generates tempo and advantage
- (We need) faster and better decision making, rooted in deeper understanding from all sources and aided by data analytics and supporting technologies
See my previous post on Developing Data Strategy
Professor Dame Angela McLean, Orienting Defence Research to anticipate and react to the challenges of a future information-dominated operational environment (Video)
General Sir Patrick Sanders, Cohering Joint Forces to deliver Multi Domain Integration (Air and Space Power Conference, 15 July 2020) (Video, Official Transcript)
For the full programme, see https://www.airpower.org.uk/air-space-power-conference-2020/programme/
The conference opened with a very interesting presentation by Peter Thomas (Prudential Regulation Authority, part of the Bank of England). Some key takeaways:
- The Bank of England is a fairly old-fashioned institution. The data programme was as much a cultural shift as a technology shift, and this was reflected by a change in the language – from data management to …
- Challenges: improve the cadence of situation awareness, sense-making and decision-making.
- One of Peter's challenges was to wean the business off Excel. The idea was to get data straight into Tableau, bypassing Excel. Peter referred to this as straight-through processing, and said this was the biggest bang for the buck.
Given the nature of his organization, the link between data governance and decision governance is particularly important. Peter described making governance more effective/efficient by reducing the number of separate governance bodies, and outlined a stepwise approach for persuading people in the business to accept data ownership:
- You are responsible for your decisions
- You are responsible for your interpretation of the data used in your decisions
- You are responsible for your requests and requirements for data.
Some decisions need to be taken very quickly, in crisis management mode. (This is a particular characteristic of a regulatory organization, but also relevant to anyone dealing with COVID-19.) If the organization can cut through the procrastination in such situations, this should create a precedent for doing things more quickly in Business-As-Usual mode.
Finally, Peter reported some tension between two camps – those who want data and decision management to be managed according to strict rules, and those who want the freedom to experiment. Enterprise-wide innovation needs to find a way to reconcile these camps.
Plenty more insights in the video, including the Q&A at the end - well worth watching.