Alessandro Boratti

Big data… what it really means 

Nowadays, people often come across the term “big data”, but few understand what it really means. The term does not only describe what it says, namely a large volume of data; it refers above all to everything we can do with that data, that is, the algorithms that can process it in little time and with few computational resources. For this reason, it is worth citing a quote that sums up the real importance of big data:

 “It’s important to remember that the primary value from big data comes not from the data in its raw form, but from the processing and analysis of it and the insights, products, and services that emerge from the analysis. The sweeping changes in big data technologies and management approaches need to be accompanied by similarly dramatic shifts in how data supports decisions and product/service innovation.” Thomas H. Davenport

Even though the term “big data” is relatively new, searching, gathering and processing large amounts of information were already established practices in the twentieth century. Over the years, these activities have steadily gained importance alongside the technologies that could speed them up. The concept of “big data” gained momentum in the early 2000s, when the industry analyst Doug Laney linked it to the three Vs: volume, velocity and variety. Today's data is collected from a wide range of sources, including business transactions, social media, the public web, media, documents and sensors attached to devices (the internet of things); it streams in at unprecedented speed and often has to be handled in near-real time. Moreover, the data comes in countless formats, from numeric databases to text documents, email, video, audio and financial transactions. Finally, when talking about big data, we should always take into account the frequency of the data flows, which can vary a lot and show many peaks and lows, and the complexity involved in connecting and correlating relationships, hierarchies and linkages.

 

The challenges of big data and the support from IT 

Characteristics such as volume, velocity, variety and complexity make big data a big challenge. From gathering the data to analysing and maintaining it, the road is long and full of issues and pitfalls. It is tricky to identify the information needed, to find a way to collect all the data, and then to analyse it and keep it safe from hackers' attacks. In this context, IT turns out to be essential, because it provides all the instruments used to tackle these issues.
There are at least five main challenges that big data users have to face every day:

1.  Storage of big data. One of the key characteristics of big data applications is that they demand real-time or near-real-time responses. For example, in the field of IoT, if a driver crashes a smart car, the data sent by the car has to be stored and made available immediately, so that assistance can arrive as soon as possible.
The increasing volume of data demands high processing performance and very large capacity. The type of storage depends mostly on the application and on the usage patterns of the data. For example, companies that deal with social media usually rely on hyperscale storage; this compute architecture, embraced by Facebook and Google, uses many simple hardware-based nodes with direct-attached storage (DAS) that feed big data into analytics environments such as Hadoop. Other commonly used approaches are “object storage” and “scale-out NAS”.

2.  Data integration. Information comes from countless sources that can contain correlated, and thus useful, data. However, the volume is so large that it can also lead to redundant, fragmented, outdated, inaccessible and incomprehensible information. For this reason, software is needed to prepare the data before analytics tools process it. As of 2016, the most widely used tools include IBM's data integration products, Dell Boomi, SAS Data Management and Microsoft's SSIS.

3.  Visualization and analysis of big data. Big data is worth little without a pictorial or graphical presentation that allows decision makers to spot patterns and grasp concepts. The best tools ease this process by helping users comprehend information quickly, identify relationships and patterns, forecast emerging trends and communicate the results. Among the most common software for big data analysis are Qlik, Oracle Visual Analyzer, SAS Visual Analytics and Sisense.

4.  Search for big data. Once gathered and used, big data has to be stored in databases, where it can be traced back to its original source. This is an issue because it can take a long time before the data is released for search and re-use. Even then, available data can still be difficult to find because of the sheer amount of information in circulation, even within a single organization.
Many IT service companies provide tools that do this kind of work efficiently; Algolia, Google Search Appliance, Constellio and Elasticsearch are just a few of the many search engines available.

5.  Data safety and privacy. Information privacy and security are among the most pressing concerns for big data, because of its open environment and very limited user-side control. As big data gets bigger and bigger, more sophisticated systems are required to protect it from hacking. Companies and organizations (especially political ones) invest a lot of money to protect their information in a world where hacking may soon be one of the most threatening and damaging crimes. According to Forbes, cybercrime costs are projected to reach $2 trillion worldwide by 2019; Root9B, Herjavec Group, Forcepoint and EY rank as the top four of the world's hottest and most innovative cybersecurity companies.
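
Two of the challenges above, data integration (2) and search (4), can be illustrated on a toy scale. The sketch below is only an illustration under stated assumptions, not how any of the products named above work: the two record sources, the field names and the helper functions `integrate`, `build_index` and `search` are all hypothetical. It merges redundant, fragmented records that share an email address, then builds a small inverted index so the cleaned records can be found by keyword.

```python
from collections import defaultdict

# Hypothetical records from two sources describing overlapping customers.
crm_records = [
    {"email": "Ann@Example.com ", "name": "Ann Lee", "city": "Milan"},
    {"email": "bob@example.com", "name": "Bob Ray", "city": None},
]
shop_records = [
    {"email": "ann@example.com", "name": "Ann Lee", "city": None},
    {"email": "bob@example.com", "name": "Bob Ray", "city": "Turin"},
    {"email": "cat@example.com", "name": "Cat Poe", "city": "Rome"},
]


def integrate(*sources):
    """Merge records that share a normalized email, filling missing fields."""
    merged = {}
    for source in sources:
        for record in source:
            # Normalize the key so "Ann@Example.com " and "ann@example.com" match.
            key = record["email"].strip().lower()
            target = merged.setdefault(key, {"email": key})
            for field, value in record.items():
                # Keep the first non-missing value seen for each field.
                if field != "email" and value is not None and target.get(field) is None:
                    target[field] = value
    return list(merged.values())


def build_index(records, fields=("name", "city")):
    """Build an inverted index: lowercase word -> set of record emails."""
    index = defaultdict(set)
    for record in records:
        for field in fields:
            for word in (record.get(field) or "").lower().split():
                index[word].add(record["email"])
    return index


def search(index, query):
    """Return the emails of records that contain every word of the query."""
    words = query.lower().split()
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


customers = integrate(crm_records, shop_records)
index = build_index(customers)
print(len(customers))
print(sorted(search(index, "ann milan")))
```

Real integration and search tools add much more (schema mapping, fuzzy matching, ranking, distribution across machines), but the two steps are the same in spirit: clean and merge first, then index for retrieval.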
 

Big data for companies is a key factor for future profitability… or survival

Big data is changing not only the world of business, but also health care, government policy and education. These are just a few of the fields that are being shaken up by the use of big data. Basically, every field where there is data to collect and assess is affected by it. But why is it so important, especially in business? A few reasons stand out:

Big data can unlock significant value by making raw information transparent and usable at a higher frequency.
As organizations create and store more transactional data in digital form, they can collect more accurate and detailed performance information on everything from product inventories to sick days, and therefore expose variability and boost efficiency and performance.
It allows ever-narrower segmentation of customers and therefore more precisely tailored promotions, products and services.
Sophisticated analytics can improve the decision-making process of the board of directors.
It can be used to improve services, products and everyday activities, in other words our welfare. As mentioned above, one of the biggest sources of big data is the internet of things, the interaction between smart devices and the internet; IoT is set to turn our world upside down within the next generation.
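
As a rough illustration of the customer segmentation mentioned above, the sketch below groups customers by total spend. Everything in it is an assumption made for the example: the purchase log, the `segment_customers` helper and the two thresholds are hypothetical, and real segmentation would normally use many more signals than spend alone.

```python
from collections import defaultdict

# Hypothetical purchase log: (customer_id, amount_spent).
purchases = [
    ("c1", 120.0), ("c2", 15.0), ("c1", 80.0),
    ("c3", 45.0), ("c2", 5.0), ("c3", 30.0),
]


def segment_customers(purchases, high=150.0, medium=50.0):
    """Assign each customer to a segment based on total spend.

    The thresholds are arbitrary values chosen for this example.
    """
    totals = defaultdict(float)
    for customer_id, amount in purchases:
        totals[customer_id] += amount

    segments = {}
    for customer_id, total in totals.items():
        if total >= high:
            segments[customer_id] = "high"
        elif total >= medium:
            segments[customer_id] = "medium"
        else:
            segments[customer_id] = "low"
    return segments


segments = segment_customers(purchases)
print(segments)
```

Narrower segments follow from adding more thresholds or more attributes (purchase frequency, product category, channel), which is where larger and more detailed data sets pay off.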

According to Wipro, companies in the retail sector gain 49% in productivity on average, consultancy firms gain up to 39%, and so forth. In business, it is clearly crucial to get to grips with big data as soon as possible. Unfortunately, some companies either do not know about it or have not yet decided to deal with it: this could cost them a lot in terms of market share and profitability in the foreseeable future. Big data may soon turn from a golden resource into an essential business asset. Managers and entrepreneurs who continue to rely on instinct alone are likely to miss the “big picture” and eventually fall behind.

 
