The term ‘big data’ is widely discussed at the moment, and leading companies across diverse industries are taking note of how analysing the information they collect can drive business improvement.
So what is big data? Well, software engineers and computer scientists would define big data as large amounts of information that are difficult to deal with using normal tools and techniques.
Problems encountered may include gathering, storage, search, sharing or analysis.
Datasets are growing because information is now often collected or combined as a single set, rather than split into smaller, easier-to-manage chunks. These large datasets can provide powerful insights into trends and correlations in fields such as medical research, crime prevention, meteorology and astronomy.
Of course, where big data means big money is in the fields of business and finance. Companies are investing huge amounts in collecting and analysing data on consumer behaviour and purchasing. Vast amounts of information are now being gathered through in-store and online customer surveys, telephone calls, recording of purchase behaviours, mobile technology usage, CCTV footage and more.
‘Big data’ is a relative term, and its definition will vary from one organisation to another. For one, it may mean moving from storing information on a single computer to requiring a large server to manage gigabytes of data. For another, it may mean thousands of servers managing multiple datasets containing terabytes or petabytes. Even the latter quantities will soon seem tiny to many organisations, though: over the last three decades, the average quantity of data that can be stored per capita worldwide has more than doubled every three years.
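To put that growth rate in perspective, here is a minimal Python sketch of the arithmetic. It assumes storage capacity doubles exactly every three years (the figure above says "more than", so treat the result as a lower bound) and uses decimal storage units. The function name `growth_factor` and the `UNITS` table are illustrative only.

```python
# Assumption: capacity doubles exactly every 3 years (a simplification
# of the "more than doubled" figure quoted in the text).
def growth_factor(years: float, doubling_period_years: float = 3.0) -> float:
    """Return the multiplier after `years` of steady doubling."""
    return 2 ** (years / doubling_period_years)


# Decimal (SI) storage units, expressed in bytes.
UNITS = {"gigabyte": 10**9, "terabyte": 10**12, "petabyte": 10**15}

if __name__ == "__main__":
    factor = growth_factor(30)
    print(f"Growth over 30 years: about {factor:,.0f}x")
    ratio = UNITS["petabyte"] // UNITS["gigabyte"]
    print(f"One petabyte is {ratio:,} gigabytes")
```

Ten doublings in thirty years works out to roughly a thousandfold increase, which is why a dataset that counts as 'big' today may look modest a decade from now.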
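Doug Laney of Gartner (formerly of the META Group) has described the challenges and opportunities involved with growing datasets as three-dimensional, pointing to the need to consider increasing volume (the amount of data), velocity (the speed at which data moves in and out) and variety (the range of data types and sources). These three dimensions have also been termed the “3Vs” model.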
While the collection, collation and management of big data can be exciting and hugely powerful, the enthusiasm around it has also been criticised. A number of academics have stated that excessive focus and budget are being placed on the infrastructure needed to cope with large amounts of data, rather than on being selective about what to collect. This, some believe, comes at the expense of skilled data selection and management, while organisations lack the experience and ability to deal with all this information.
Data security is also a concern for many: massive data storage could mean massive risk if the data is not kept securely. The issue becomes more complex as demand grows for fast access to data, which is increasingly enabled by cloud computing.
What’s clear is that big data is going to be high on the agenda of most of the world’s major organisations over the next few years. In 2012, the White House alone committed $200 million to a research project known as the ‘Big Data Initiative’.