Are you realising the full potential of Microsoft SQL Server?

Thursday, February 23, 2017

One of our Sydney consultants, Dawie Kruger, shares some of his thoughts on the features available in Microsoft SQL Server that he feels are not being fully utilised by users. 

I was privileged enough to attend Microsoft Ignite 2017 this year on the Gold Coast. There was considerable talk about the new features on offer in SQL Server 2016 Service Pack 1. If you are not familiar with this yet, Microsoft is releasing over 15 Enterprise Edition features into the SQL Server Standard and Express editions. These features include In-Memory OLTP, Partitioning and Compression, as well as additional security features like Always Encrypted.



These are some great new features, and my fellow attendees agreed with me that this is a fantastic step forward.

One of the sessions I attended was Explore In-Memory OLTP Architectures and Customer Case Studies, presented by Sunil Agarwal (Principal Program Manager at Microsoft Corporation). One of the first questions in this session was “Who is currently using In-Memory or Compression in SQL Server?” I was surprised by how few hands were raised.

This got me thinking: how many companies are running the latest versions of software without utilising them to their full potential? At one of my previous clients, the CIO embarked on a strategy to identify how the existing applications could be used to their full potential as an alternative to upgrading to the “Latest and Greatest”. We found that in most cases features and capabilities were not being utilised, and once configured properly these had such a significant impact on performance and operational support that the urgency of upgrading the applications was reduced.

In other cases, I have seen how applying features like Compression or Partitioning to tables in Microsoft SQL Server not only improved performance and stability but also won back gigabytes of hard disk space, which is still a hot commodity on today’s VM hosts.

In this blog I will focus on two Microsoft SQL Server technologies that I feel are underutilised in the DW/BI space: Compression and In-Memory OLTP.

Compression

Table compression has been available in SQL Server in two flavours, row and page, for several versions now, and in my personal experience is always something to consider. The example below is from a site where compression resulted in an average 38% reduction in the disk space required by the solution.

We did see a slight increase in the build time of some of the compressed tables during the daily ETL run; however, the improved query speed more than compensated for this, with cube processing times decreasing by 50%.
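If you want to gauge what compression could do for one of your own tables, the general pattern is to estimate the savings first and then rebuild with the chosen compression type. The sketch below is indicative only; the schema and table names (dbo.FactSales) are placeholders, and you should test against your own data.

-- Estimate the space savings of PAGE compression before committing to it
-- (dbo.FactSales is a placeholder table name).
EXEC sp_estimate_data_compression_savings
    @schema_name      = 'dbo',
    @object_name      = 'FactSales',
    @index_id         = NULL,
    @partition_number = NULL,
    @data_compression = 'PAGE';

-- Apply page compression by rebuilding the table (indexes can be rebuilt the same way).
ALTER TABLE dbo.FactSales REBUILD WITH (DATA_COMPRESSION = PAGE);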

In-Memory OLTP

In-Memory OLTP is a technology that allows you to store database tables in server memory (RAM). A demonstration at Ignite highlighted the improvements made to in-memory tables in SQL Server 2016 and the benefits to be gained if they are used correctly. In SQL Server 2016 there are three durability levels (fail-safes) for In-Memory tables:

  • Full Durability – Data is stored in memory with every transaction committed to disk. (Full Redundancy)

  • Delayed Durability – Data stored in memory with a delay in writing to disk.

  • Non-Durable - Data stored in memory only.


These options allow for many different uses in handling and processing data. For example, instead of adding pressure to TempDB with temp tables, why not create a Non-Durable in-memory table? Or why not land data in an in-memory table, allowing you to load multiple sources at the same time into a single table without locking or latching? A minimal sketch of a non-durable table follows below.
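To make the non-durable option concrete, here is a minimal sketch of a memory-optimised staging table. It assumes the database already has a memory-optimised filegroup, and the table, column and bucket-count values are illustrative only, not recommendations.

-- Non-durable (SCHEMA_ONLY) memory-optimised table: rows live in memory only,
-- so the schema survives a restart but the data does not.
CREATE TABLE dbo.StagingOrders
(
    OrderID     INT            NOT NULL
        PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT = 1000000),
    CustomerID  INT            NOT NULL,
    OrderAmount DECIMAL(18, 2) NOT NULL
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_ONLY);

-- Swap SCHEMA_ONLY for SCHEMA_AND_DATA to get the fully durable behaviour;
-- delayed durability is controlled separately at the database or transaction level.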

To see the full benefits offered by In-Memory OLTP, there are some development practices to take into consideration. At present you get substantially higher performance gains when you use natively compiled stored procedures to load your data into In-Memory tables. On a test bench we saw record inserts jump from 3,000 to 8,000 per second simply by converting the SQL table to In-Memory, and then jump again to 12,000 inserts per second by natively compiling the stored procedures.
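As an indication of what natively compiling the load procedure looks like, the sketch below inserts into the illustrative staging table from the previous example. The procedure name and parameters are assumptions for this example, not a prescribed design, and your own figures will vary.

-- Natively compiled stored procedure that inserts a row into the
-- memory-optimised table sketched above (names are illustrative).
CREATE PROCEDURE dbo.usp_InsertStagingOrder
    @OrderID     INT,
    @CustomerID  INT,
    @OrderAmount DECIMAL(18, 2)
WITH NATIVE_COMPILATION, SCHEMABINDING
AS
BEGIN ATOMIC WITH (TRANSACTION ISOLATION LEVEL = SNAPSHOT, LANGUAGE = N'English')
    INSERT INTO dbo.StagingOrders (OrderID, CustomerID, OrderAmount)
    VALUES (@OrderID, @CustomerID, @OrderAmount);
END;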

One aspect that is truly exciting is that Table Compression and In-Memory Tables are compatible technologies and can be used in conjunction with each other. I strongly encourage you to explore them further if you aren’t already leveraging this functionality.


A disclaimer here is that the figures seen in this blog are indicative examples only and are dependent on the type and volume of data used. Just as you would investigate and test practices like indexing strategies, these features should be properly tested before being implemented in production systems.







Microsoft Ignite

Friday, February 17, 2017

Two of our consultants, Michael Betterton and Dawie Kruger, are attending Microsoft Ignite on the Gold Coast this week, and in this blog they share some of the insights from the event.

The opening keynote, delivered by Scott Guthrie, the VP of Cloud and Enterprise at Microsoft, carried a strong cloud message, with a key takeaway being Microsoft's focus on hybrid cloud/on-premises solutions. Themes of cloud, analytics, the Internet of Things and new infrastructure approaches prevailed. One of the case studies discussed was Rolls Royce, who have an IoT solution on all of the airplane engines they make that streams data to Azure, uses Azure ML to predict engine failure and displays the results in Power BI.

The keynote finished with a demo of Azure Cognitive Services. The demo had two components: the first was real-time facial recognition; the second was the serverless cognitive services offering, a Lego-brick-style drag-and-drop code editor. In the demo the presenter built sentiment analysis of tweets that fed into a Dynamics instance in Azure, creating cases for negative tweets, all without provisioning a single VM.

Also discussed was the vision for SQL Server and beyond, presented by Lindsay Allen, program manager for SQL. The approach is to modernise business apps by moving them to the cloud, either through IaaS (infrastructure as a service) or through PaaS (platform as a service). The PaaS offering hosts SQL databases in the cloud. A new offering from Microsoft being released soon, the “CL Series Databases” PaaS, lets you move on-premises databases to the cloud with little effort into a single elastic cluster, which can scale to 150 PB. The smart database features in Azure have also been updated with the ability to suggest index creation/drops, flag schema issues and parameterise queries for PaaS databases.

Understanding Windows Containers - using containers you are able to create a base OS or software-layer environment for developers that can be deployed to UAT and production servers, helping to eliminate the discrepancies that cause applications to crash or “work differently” in prod. It takes the concept of virtualisation to the software layer instead of the server layer. Some benchmarks saw new containers being spun up in less than 3 seconds using PowerShell scripts. Containers are not meant to be used for databases, as they are blown away and have very small footprints, but they do make for some interesting concepts in app development and deployment. More on containers can be read here.

There were a few sneak peeks of upcoming SQL Server features coming in June. In no particular order they are:

  • High Availability, Disaster Recovery & Availability Groups will be available across operating systems (Windows + Linux), with any combination of OSs.

  • Clustered columnstore indexes available for LOB data types.

  • The ability to pause and resume online index rebuilds.

  • Adding machine learning to the query optimizer so it can attempt to learn from its mistakes. This is a first step only; the key implementation here is table-valued functions and assessing the size of the result set.

  • Python Integration - apparently R wasn’t enough for MS, and soon Python will have full integration with the DB engine.

  • Graph Data Structures – this involves full support for CRUD of graphs & db-engine integration. This was only a very short demo, but queries over the graph structure were written in normal T-SQL with its own syntax (a rough sketch follows this list). It will be interesting to see applications of this in the future.
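To give a feel for that syntax, here is a rough, hypothetical sketch of graph node and edge tables queried with a MATCH predicate inside ordinary T-SQL; the table and column names are made up for illustration.

-- Hypothetical node and edge tables (names invented for this example).
CREATE TABLE dbo.Person (PersonID INT PRIMARY KEY, PersonName NVARCHAR(100)) AS NODE;
CREATE TABLE dbo.FriendOf AS EDGE;

-- A graph query expressed in normal T-SQL using the MATCH predicate.
SELECT p2.PersonName
FROM dbo.Person AS p1, dbo.FriendOf AS f, dbo.Person AS p2
WHERE MATCH(p1-(f)->p2)
  AND p1.PersonName = N'Dawie';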

Look out for the next series of blogs that go into more detail on some of these topics.

JISC and HEPI in the UK publish a Headline News article referencing the success of Altis' work on Learner Analytics in Australia.

Thursday, February 09, 2017

JISC - a not-for-profit organisation that encourages higher education institutions to adopt digital services and solutions - and HEPI - a UK think tank devoted to higher education - published a news article earlier this week entitled 'Reducing drop-out rates with Technology: Learning from Australia.'

This article reviewed best practice approaches across the world and referenced the work that Altis delivered with the University of New England (UNE) as a success story. UNE managed to reduce its drop-out rates from 18% to 12% using its Early Alert system. The system was designed by UNE and delivered by Altis Consulting, providing a clear case study for the success of Learner Analytics. It was originally implemented in 2011 and has been providing benefit since. The work undertaken by UNE earned a number of awards, including an Australian Learning and Teaching Award, an Australian Universities Quality Agency (AUQA) Commendation and an Australian Learning and Teaching Council citation. Altis were very proud to have played our small part in helping the University of New England win these awards.

Building on this experience, Altis have developed a new Learner Analytics kick-start called BloomThrive, in association with the University of London and specifically tailored for the UK market, to help UK universities gain similar results.

For any enquiries regarding Altis, BloomThrive or Learner Analytics in the UK please contact Peter Hopwood or Daley Davis on +44 208 133 5095 or email: connect@altisglobal.co.uk



Australian Utility Week - Key Takeaways

Tuesday, December 06, 2016

Our Utility Practice Lead, Rhubesh Gommiah, attended The Australian Utility Week 2016 conference held in Sydney last week. 

The conference focused on the theme of ‘Building the Digital Utility’. In this post, Rhubesh shares some of his key takeaways from the event.

This was my third consecutive attendance at this event and this year Matt Feltham from our NSW/QLD region co-presented with Ivan Beirne from Unitywater as part of the ‘Analytics in Action’ track. Their presentation focused on the tangible benefits delivered by the business intelligence solution that Altis has assisted Unitywater in implementing. Ivan gave some great examples of how the solution is being used in his area to help improve asset performance and utilisation and better target asset investment.

As in previous years, there were multiple tracks available as part of the two-day conference programme. The sessions that I ended up attending covered a broad range of topics, from the Internet of Things (IoT) to water smart meters and the decentralisation of the energy sector. Some of the presentations that caught my attention included:

  • A long-term Altis client, TransGrid, demonstrating how they are experimenting with Augmented Reality technologies to facilitate the work of field employees and improve their safety.

  • The CEO of another Altis client, Vector, describing how his organisation is embracing the change in the energy sector and adopting new technologies like solar, battery storage and electric vehicles instead of being disrupted by these technologies.

  • A university researcher highlighting the future potential of water smart meters from a customer and smart home perspective. Interestingly, some of the use cases highlighted such as leak detection and appliance fingerprinting are already being trialled by one of our Victorian water clients as part of a project that Altis is currently involved in.

There were also a number of start-ups present at the event, with most of these new entrants in the sector being technology companies which have been capitalising on opportunities around IOT and data analytics. For example, a new electricity retailer in NSW is offering an integrated solution of advanced pool automation services and electricity to swimming pool owners and operators. Their cloud-based solution integrates data from installed sensors with AEMO and weather data to optimise the operation of pools and reduce electricity costs for their customers. Another start-up is building a database of addresses with rooftop solar after having developed a machine learning algorithm to scan satellite images and identify rooftops with solar panels.

For traditional utilities, embarking on the digital journey clearly offers a range of opportunities for innovation. In particular, by using analytics to capitalise on the ever-increasing amounts of data that they are collecting, they can re-invent the way they engage with their customers and manage their assets. 

To find out more about how the changing landscape of data and analytics can benefit your utility organisation, please contact us.


AWS re:Invent

Tuesday, December 06, 2016

This week, Guillaume Jaudouin, our AWS Practice Lead, takes us through some of the announcements made by AWS during the re:Invent week.


As it does every year, the AWS re:Invent week came with a host of new announcements; you can find the complete list here.

At Altis Consulting we focus on data and what we can do with it, so don’t expect full hardware specs and a comparison of the new EC2 instance families that have just been released. If you are still keen to talk gear, drop me an email and we can chat when no one is watching. Instead, in this blog we will talk about two new services arriving in the Big Data and Analytics arena.

The first one, AWS Athena, is already available in a few regions in the US. This service allows you to query data stored in S3 directly. Queries are written in standard SQL, and supported file formats include CSV, JSON and Parquet. Athena integrates with AWS QuickSight, or any JDBC-compatible BI tool, for visualisation of the output.
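As a hedged illustration of the pattern, the sketch below defines an external table over CSV files sitting in S3 and then queries it with plain SQL; the bucket, path, table and column names are invented for this example.

-- Define an external table over CSV files already stored in S3
-- (bucket, path and columns are invented for illustration).
CREATE EXTERNAL TABLE weblogs (
    event_time  string,
    user_id     string,
    url         string,
    status_code int
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
LOCATION 's3://my-data-lake/weblogs/';

-- Query it with standard SQL, straight from the Athena console or a JDBC client.
SELECT status_code, COUNT(*) AS hits
FROM weblogs
GROUP BY status_code
ORDER BY hits DESC;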

In conversations about Data Lakes, customers often ask the following question: “Once I’ve got all my data stored in my Data Lake on S3, how do I access it?” The answer is often that further processing needs to be done on EMR, or that data can be pushed to Redshift and then analysed with a data discovery tool. Well, AWS Athena will certainly change this conversation.

Of course there is still a place for data analytics heavyweights like Redshift and EMR; Athena certainly won’t replace either of them. Rather, they work together in a bi-modal way, with Mode 1 being addressed by EMR and Redshift and Mode 2 supported by Athena for ad-hoc querying of the data. If you are unfamiliar with bi-modal BI, I suggest you check out point 6 of this blog post by my colleague James Black.

(source: twitter.com/AWSreinvent) 

The other service that was announced is AWS Glue. The service is not available yet, but it is presented as ETL as a service. There is not a lot of information available at the moment, but it seems that ETL processes can be created without the need for scripting, although generated scripts can be modified if needed. As expected, it will integrate with most AWS data stores (S3, RDS, Redshift) and crawl them to generate a catalogue of the data available in your account.

(source: AWS.AMAZON.COM) 

These are exciting times as we can now envisage a complete fully-managed bi-modal big data and analytics platform on AWS – all of it without the need to spin up a single EC2 instance. 
One last thing: you can now move loads of data to AWS using a truck!

(source: AWS.AMAZON.COM) 

Fixed Price Agile Scrum for a Data Warehousing Project - Can it work?

Wednesday, November 30, 2016

Alok Joshi from our Melbourne office shares his insights on whether Agile Scrum can work on Fixed Price Data Warehouse projects.

I've often heard the argument that Agile Scrum fundamentally cannot work on a fixed price project because of the predetermined scope of work. Aside from the price, what are you actually fixing? Is it requirements or outcomes? Outcomes are subject to the interpretation and skill of the person or team tasked with completing the project, whereas requirements can be agreed and clarified. For example, if the requirement from accounting is a quarterly report that shows month-end expenses by department, is the outcome then to show the information in a bar chart? A tabular structure? A line graph? This can be clarified with accounting, but at the end of the day it is up to the team executing the work to provide the capability. This especially goes for more complex scenarios requiring specialised skills, like dimensionally modelling a data warehouse. A requirement may be to model the data to optimise for scalability and performance - there could be a dozen ways to achieve this outcome when modelling a data warehouse. So let's talk about what we should actually be fixing here: the requirements to be met, and how the core components of Agile Scrum can be adapted.
Requirements and Product Backlog
The product backlog in a fixed price engagement is your scope of work, and in Agile Scrum this is your master list of requirements to be prioritised. What this means for fixed price is that all items in the product backlog need to be completed by the end of the project. This is in contrast to a sprint backlog, where the highest priority tasks are “picked” at the start of each sprint. Therefore, all components of the development lifecycle are captured here in the form of tasks (or stories if applicable), from requirements, analysis and design through to build, test and deploy. This is your work breakdown structure and your map to follow for the project.
 
In Agile Scrum, Product Owners have the opportunity to revisit the product backlog at any time and reprioritise the tasks that they want the team to tackle in the next sprint. So why not in a fixed price project? The only difference is that if new items are brought into the backlog and prioritised, or existing items are removed, this equates to a change in scope and the relevant change control processes would be engaged, i.e. a change request and subsequent agreed re-scoping. This shouldn’t be seen in a negative light, but as an essential step in ensuring customer requirements are met.
Planning and Estimation
In Agile Scrum projects, sprint planning is where the team comes together and decides, as a team, what is achievable for the duration of the sprint, with a sense of priority and guidance from the product owner. The team picks tasks from the top of the product backlog and estimates the effort required to complete them.
 
In a fixed price project, however, the sprint goals are predetermined, the backlog priorities are already set, and the tasks to be completed are already estimated to be feasible for the sprint duration. So why hold sprint planning? The simple answer is to create the opportunity to challenge the timeline, the work being performed and the outcomes the sprint is looking to achieve. Are the goals we set 2 months ago for this sprint still relevant? Are there more pressing items to address? It is better to answer these questions intermittently than at the end!
Showcases and Retrospectives
Showcases enable the team to demonstrate the capability that has been delivered in the sprint. This does not need to change in a fixed price project. The team can organise showcases so that whatever part of the project they are working on can be demonstrated to stakeholders. It is better to get feedback (good or bad) early so you know what works and what needs to be addressed.
 
This is where retrospectives in Agile make the process of continual improvement within the team much more achievable. It is important for the team to reflect on what went well and what could have been done better. Holding these sessions is just as important in a fixed price engagement. If the processes or products being developed are found not to be working, it once more provides the opportunity for change control processes to kick in and allow the team to continue down the right path.
 
At the end of the day, it's about expectations that are agreed upon and being able to manage them as flexibly as possible when something does come up (and it will!). Having a framework to help keep everyone (delivery team, project manager, customers/stakeholders) on board and aligned makes for a much happier outcome.

What do you think about applying Agile Scrum to Fixed Price Data Warehouse projects? Have you used this approach? Share your thoughts.

Six takeaways from the TDWI Conference

Wednesday, October 19, 2016

James Black from our Sydney office attended the TDWI conference "Analytics across the Enterprise" earlier this month in San Diego, California. Here he shares his key takeaways from the conference.

The main theme of the conference was developing insight-driven organisations that succeed in using analytics to create real business impact. 

 

As with all TDWI conferences, there was a myriad of courses on offer across a variety of learning tracks, from 'Getting Started with Analytics', to 'Managing Analytical Data' and 'Visualising and Communicating using Data' through to a first ever TDWI Data Scientist boot camp. 

Attending the conference was a rewarding experience - getting to hear and interact with the industry thought leaders who present the courses, and sharing experiences with my fellow attendees from the front line of Analytics and Data Warehousing. For me, the overarching theme of the courses I attended was the challenge for traditional DW/BI practitioners to adapt our structured methods and processes to the more fluid and changeable landscape of the Big Data world.

Here are my six key takeaways from the conference:

1. ‘Traditional’ DW/BI is still important, but it’s no longer enough

Data Warehousing and Business Intelligence as we know it – reports, dashboards, cubes, ETL, 3NF, Star Schemas – has been around for decades and is to some extent overshadowed these days by all the hip, exciting, new-fangled stuff going on related to Big Data. But it remains hugely important to how most organisations make business decisions.

However, many businesses are moving beyond conventional BI to gain a competitive edge, not merely by seeking to answer the ‘who’, ‘what’ and ‘when’ of DW/BI, but the analytical questions of ‘what if..’, ‘what should..’ and ‘what will happen in the future’. This is the power of Advanced Analytics – analytics that predict what is likely to happen and future behaviours; and analytics that prescribe future courses of action.

2. The enterprise is no longer the centre of the data universe

The question of what data of value sits outside of an organisation’s core transactional and customer management systems is of course nothing new. But those halcyon days, where the biggest concern for Data Architects was how to get all that data residing in spreadmarts and Access databases into the Data Warehouse are over. 

Data that is required for Analytics is everywhere – inside and outside of the enterprise. The data internal to the enterprise remains important, but the data external to the enterprise is becoming just as important – web data, social media, open data and government datasets, text and sensor data, system logs and machine generated information. 

As well as the location of the data, the type of data available for analysis has also changed – not just greater volumes at a lower latency but non-relational, inconsistent and without metadata. Data that doesn’t fit neatly into the structured, architected models of our usual relational databases - streaming and telemetry data, social media data, JSON, XML, textual data. New types of databases are being used to store this non-relational information – Key Value, Document and Graph databases – the ‘No SQL’ databases.

3. A new paradigm for architecting and modelling BI solutions

So the data is not always relational, we won’t always know the exact format before we receive it, and the format may change – resulting in a shift in how we architect and model future DW/BI solutions.

Does unstructured data make data modelling impractical? Does NoSQL imply no data modelling? Do E-R and star schema models still matter? The answer to these questions is that data modelling is still an important process; data modelling for relational structures is not going away; but the data modelling process must change to keep pace with the rapidly evolving world of data.

In traditional BI, we model before storing data. For these new data sources, we can no longer model and architect all the data in advance - we will have to store the data first and then model it. This is the ‘Schema on Read’ paradigm versus the ‘Schema on Write’ process of traditional Data Warehousing.

The process of architecting DW/BI solutions also needs to become more adaptive. This means no ‘one size fits all’ architecture with rigid ‘enterprise’ standards but allowing greater access to the data at earlier stages of its processing i.e. rather than waiting for the data to be transformed and loaded to the DW, users get access to it in its rawer forms. 

While Data Architecture has usually involved control and centralisation, the requirements of Data Scientists are for empowerment and decentralisation. Architecture and governance remain hugely important but some happy medium is required in relation to rigour and security, so for example, sensitive customer data isn’t just dumped in to the data lake for anyone to access.

4. Big Data – moving beyond the hype and into the mainstream

Big Data has justified the hype and it was clear at the conference that many attendees were from organisations that were already utilising the data, technologies and techniques of Big Data. Big Data is not the goal, nor is it the question – it’s not data for the sake of data. Instead, Big Data enables business strategies through Analytics. 

However, to fully grasp the opportunities that Big Data may provide, businesses are moving beyond the 3 original “V”s of Big Data – Volume, Velocity and Variety. There are now additional “V”s that need to be taken into account:

  • Veracity – the data must be trustworthy and credible.

  • Visualisation – after processing the data, the importance of visualisation is that it makes the data comprehensible. This also doesn’t mean simply graphs and charts but visualisations that enable data discovery and exploration.

  • Value - Data in and of itself has no intrinsic value. The business must be able to use the data for Advanced Analytics in order to get value from it.

 

5. Hadoop is still the elephant in the room

A number of the TDWI courses focused on the dizzying array of technologies that have sprung up to enable the delivery of Big Data solutions. It is clear that Hadoop (and its huge ecosystem of components) is still the major platform for Big Data applications across both analytical and operational implementations. This is due to its storage and processing capabilities, high performance and availability, and competitive pricing.

Rick van der Lans, in his course on New Data Storage Technologies, stated that there are a number of barriers to widespread Hadoop usage: the complexity of the APIs for accessing data on Hadoop, which require programming skills (making Hadoop unsuitable for non-technical users); the low productivity of developing against the Hadoop APIs; and limited tool support, as many reporting and analytics tools don’t support interfaces to Hadoop via MapReduce and HBase.

This has led to the growth of ‘SQL-on-Hadoop’ engines which, according to Rick, offer the performance and scalability of Hadoop with a programming interface – SQL – that is known to non-technical users, requires a lot less coding and is supported by many more reporting and analytical tools. The first SQL-on-Hadoop engine was Hive, and since then demand has grown, as has the choice of engines – Drill, Impala, Presto, Jethro, Splice and a myriad of other applications.

Whether it’s applications on Hadoop; alternatives to Hadoop such as Spark; high-performing SQL-based appliances like Greenplum, HANA or Netezza; or the large number of NoSQL products out there - the choice is seemingly endless and confusing. The recommendation coming out of the conference is to do your research into the technologies and undertake Proofs of Concept before settling on a Big Data technology. It’s very much a case of ‘caveat emptor’ – buyer beware – you don’t just want an application with a funky name; it has to meet the operational or analytical needs (or both) of your organisation, and there have to be resources available who have the skills to utilise it.

 

6. Stop being the Department of No and become the Department of Yes

So how does a BI department or team evolve to cope with the brave new world of Analytics? One of the complaints from analysts and data scientists is that the traditional method of BI delivery takes too long. In a world where we don’t always know the requirements, the model, or the solution until we have loaded the data, how does the BI team prevent itself from being the bottleneck?

‘Stop being the Department of No and become the Department of Yes’ was a great quote I heard from Mark Madsen, one of the TDWI course instructors. By giving people a place for their ungoverned data and providing access to everything, even in its raw form - the sandbox approach for quick data load and analytics - we become adaptable to change and disruption. This is the ‘Bi-Modal’ approach:

  • Mode 1 is the traditional, well governed, secure, end-to-end DW/BI solution. This is where the repeatable processes that have standards, rules and models applied will exist and these projects will still offer huge value.

  • Mode 2 is the ‘sandbox approach’ – an area where the data can be loaded quickly to let the data scientists get to work and the business play with the data and decide if there is merit/value in transitioning to Mode 1.

This ‘two-stream’ approach to development and governance is certainly a theme coming through in the various courses. As well as allowing rapid access to the data for quick wins, this also allows for prototyping, PoCs and a ‘Fail Early’ approach for deciding on the best Big Data technology for an organisation.

Do you rely on TM1 for your financial budgeting and planning or reporting?

Friday, October 14, 2016

Do you know that on 24 November 2016 all TM1 servers will stop working? This is due to the SSL certificates expiring on that day. To keep TM1 up and running, you will need to ensure that new SSL certificates are in place.


What are your options?

1. Disable SSL (insecure)
2. Switch to V2 certificates (recommended)
3. Wait to see if IBM releases a fix pack for 10.2
4. Upgrade TM1 to 10.2.2, install FP6 and then the SSL certificate fix (time consuming)

How can we help?

Altis Managed Services have developed a solution based on Option 2, where we will remotely manage the loading of V2 certificates through your development, test and production environments.  For just $5,000 we can remove your risk and ensure TM1 is working when you need it.

Moving forward, we can manage your entire TM1 environment for as little as $5,000 per month and give you absolute confidence and peace of mind in the safety and management of your data.

Want to learn more?

Contact us at Connect@altis.com.au or call us at 02 9211 1522.



Spreadsheets versus a Data Warehouse – the two sides of the Analysis coin

Thursday, September 29, 2016

Ash van der Spuy from our Sydney office shares his thoughts on how IT and business can work more closely to bridge the gap between familiar spreadsheet analysis and reporting from a data warehouse for Financial Analysts.
 

Spreadsheets have been the mainstay of the Financial Analyst for decades, and their blank canvas and powerful calculation functions ensure that they remain the “go to” tool for both quick and complicated calculations. There is no tool as efficient for mocking up a quick analysis as a spreadsheet, and it is important to keep this in mind, because there is a tendency to criticise Excel in order to highlight the importance of formal reporting and analysis. While spreadsheets cannot do everything, data warehouses similarly cannot always meet the demands of Financial Analysts; for proper analytics these functions have to meet each other half way.

1. The IT side of the coin

Every tool has its purpose and spreadsheets have their place. The same is to be said for formal reporting and analysis tools such as the ones developed by Tableau, Qlik, Microsoft, IBM, Oracle, SAS and SAP. These reporting and analysis tools, however, work best with well-structured data, and even better if this data is in a database in a proper data model. There are tools which are improving data exploration using diverse data sets, but their functionality is limited. When it comes to complicated calculations using snippets of information, often from temporary or ad hoc data sources, to answer a specific pressing question, the spreadsheet reigns supreme.

This is the side of the coin that Business Intelligence or Information Management professionals need to support. Some questions need immediate answers and pose requirements a data warehouse (DW) may not accommodate, but this doesn’t mean that the DW team should just leave the Analysts to blunder ahead.


2. The Business side of the coin

Business Analysts, and Financial Analysts in particular, can easily reach a place where they view the data warehouse as slow and cumbersome. They have pressing questions and can’t wait until tomorrow or next week to answer them, and this is when they often resort to entering the data into a spreadsheet.

But it is exactly this haste that leads to keying errors, mistakes in formulas or using the wrong data. Worse than that is using the correct data without knowing that it is of bad quality - just because data comes from a financial system or an ERP, its quality is not guaranteed.


Case in point: the perfect storm

I have seen that the best analytics and reporting come from organisations where the Data Warehouse and the Business work together, and Analysts and IT professionals often forget that. The goal for a data warehouse should be to ensure that, for those questions that must be answered in a spreadsheet, the largest proportion of the data comes from good quality, trusted sources which the data warehouse has prepared and approved. When that doesn’t happen, the perfect storm is unavoidable.

The perfect storm is the scenario where a spreadsheet contains both good data from a data warehouse and bad quality data and incorrect formulae input by analysts, and is then used by Management to make decisions. Since some of the data is from the data warehouse, the results are held in high regard and decisions are made with confidence, even when they affect people’s lives and jobs.
We have seen this happen to companies big and small, both locally and globally. Take the example of Barclays and their overpayment of hundreds of millions for subprime debt. During the GFC, hidden columns in a spreadsheet led to their perfect storm. http://www.computerworld.com/s/article/9117143/Excel_error_leaves_Barclays_with_more_Lehman_assets_than_it_bargained_for

Another example comes from an economics study which was used strategically, largely by Europe and America, to deal with the GFC. Upon further analysis, academics found the study was based on results from a spreadsheet containing errors, and the resulting fallout was international.
http://theconversation.com/the-reinhart-rogoff-error-or-how-not-to-excel-at-economics-13646

Recommendations for a way forward

Here are my 6 key observations and recommendations for Business and IT to meet in the middle:

  • Analysts have to keep in mind that the formal systems and policies employed by the DW are there to:

    • A. Supply automated results that accelerate decision making and make it more efficient in the future.

    • B. Ensure the data is of a high quality, and can be trusted for decision making.

  • Once Analysts realise they are answering the same ad hoc question repeatedly, and/or major decisions are being made based on the answers, the help of the DW should be sought. This is to automate the results and to ensure that the data is cleansed and verified if necessary.

  • Cleansing and verifying the data will give further weight to the results and allow Analysts to focus on the next set of ad hoc questions.

  • The automation process will also deconstruct the formulae in the analysis, providing another level of confidence to the results.

  • IT professionals have to keep in mind that business needs are ever changing, and there is no way that they can keep up with every requirement. Instead the focus should be to grow that middle ground, and feed it with more and more trusted data.

  • Alternatives to the traditional DW architectures should be assessed for fit, as a sandbox or multi-modal (this could include Lambda architecture) data platform can go a long way towards lowering the latency requirements that drive users away from the DW.

Spreadsheets, business intelligence software, analysts and specialists all have their place. The key is to understand the function of each and promote synergy through collaboration, thereby better assisting the broader company.

In future blogs we will explore how sandboxes and multi-modal architectures, including Lambda architecture, are changing the Information Management landscape.

 

 

Power BI – Tips for a successful rollout

Thursday, August 25, 2016

The Power BI toolset from Microsoft is changing the game in the data visualisation space. With major monthly releases building on the functionality and rapidly incorporating new features, combined with its competitive pricing, we are seeing many clients dipping their toes into the world of Power BI.

Altis surveyed some of our Power BI users to collect feedback on their experience of rolling out the tool and to share some of the lessons they learnt along the way.


Make sure “In the Cloud” is considered

Not everyone is aware that Power BI is a fully cloud-based solution. While Power BI Desktop offers local development, the final published reports and dashboards are pushed into the cloud on the powerbi.microsoft.com website. This means ensuring you engage with the right people in your organisation for architecture, data security and infrastructure in advance, so that everyone comes on the journey. Validate your security requirements early on and continue to validate regularly with key users after the initial roll-out.

Microsoft are working on an on-premises version which, according to their roadmap, will first be an enhancement to Datazen.


Be aware that visualisation tools may be completely new to some users

The competitive pricing of Power BI Pro opens it up to a larger potential audience than traditional business intelligence tools. Self-service may be completely new, and users may not be familiar with some of the underlying data modelling concepts. We recommend setting up a Power BI user group and using creative ways to encourage people to join and participate. For example, have a “report of the month” competition, promote knowledge sharing to bring everyone up to the same level of understanding, and make it easy to access current documentation and FAQs. https://powerbi.microsoft.com/en-us/support/

There are differences between the Free and Pro versions which are worth becoming familiar with so as not to be limited down the line.


Governance continues to be important in the self-service world

As people can build their own reporting on a mix of data sources, you may have varying degrees of trust around the different reports being published. A great suggestion from one client is to put in place a folder structure that attaches a level of trust or confidence against reports and dashboards. For example:

  • Gold – Built on trusted data sources with reporting that has been system and UAT tested and approved

  • Silver – Reporting is on trusted sources and individual users have tested, but it hasn’t gone through the full testing and approval cycle

  • Bronze – Reports are built on sources that are not yet validated and approved

Have a central approval team and clear instructions that document the approval process. Leverage the Power BI user group to contact users with “silver” or “bronze” reports and encourage them to move their reports through the cycle.


Communicate and have a process around managing the monthly Power BI updates

Microsoft have a very good blog that lists all the new features and updates being released at https://powerbi.microsoft.com/en-us/blog/, as well as a forum for suggesting ideas on how Power BI can be improved. Our experience is that Microsoft are being very responsive to client feedback, and some of our clients who have posted ideas have since seen the new functionality released and are able to leverage it in their dashboards.
