Monday's Musings: Beyond The Three V's of Big Data - Viscosity and Virality

February 27, 2012

Revisiting the Three V's of Big Data

It's time to revisit the original July 4th, 2011 post on the Three V's of big data. Here's the recap:

Traditionally, big data describes data that's too large for existing systems to process. Over the past three years, experts and gurus in the space have added more characteristics to define big data. As big data enters mainstream language, it's time to revisit the definition (see Figure 1).

Volume. This original characteristic describes the size of data relative to processing capability. Today a large number may be 10 terabytes. In 12 months, 50 terabytes may constitute big data, a pace well ahead of the doubling every two years or so usually associated with Moore's Law. Overcoming the volume issue requires technologies that store vast amounts of data in a scalable fashion and provide distributed approaches to querying or finding that data. Two options exist today: Apache Hadoop based solutions and massively parallel processing databases such as Calpont, EMC Greenplum, EXASOL, HP Vertica, IBM Netezza, Kognitio, ParAccel, and Teradata Kickfire.
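The 10 to 50 terabyte example can be checked with a little arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the 24-month doubling period is an assumed stand-in for a Moore's-Law-style pace, not a figure from this post.

```python
import math


def projected_size_tb(current_tb: float, months: float, doubling_months: float) -> float:
    """Size after `months` if the threshold doubles every `doubling_months`."""
    return current_tb * 2 ** (months / doubling_months)


def implied_doubling_months(start_tb: float, end_tb: float, months: float) -> float:
    """Doubling period implied by growing from start_tb to end_tb in `months`."""
    return months * math.log(2) / math.log(end_tb / start_tb)


if __name__ == "__main__":
    # Assumption: a 24-month doubling period starting from 10 TB.
    print(round(projected_size_tb(10, 12, 24), 1))
    # Doubling period implied by going from 10 TB to 50 TB in 12 months.
    print(round(implied_doubling_months(10, 50, 12), 1))
```

Under these assumptions, a two-year doubling period takes 10 terabytes to roughly 14 terabytes in a year, while reaching 50 terabytes in the same year implies a doubling period of a little over five months.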

Velocity. Velocity describes the frequency at which data is generated, captured, and shared. The growth in sensor data from devices and in web-based click stream analysis now creates requirements for more real-time use cases. The velocity of large data streams powers the ability to parse text, detect sentiment, and identify new patterns. Real-time offers in a world of engagement require fast matching and immediate feedback loops so promotions align with geo location data, customer purchase history, and current sentiment. Key technologies that address velocity include streaming processing and complex event processing. NoSQL databases are used when relational approaches no longer make sense. In addition, the use of in-memory databases (IMDB), columnar databases, and key value stores helps improve retrieval of pre-calculated data.
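One simple way to put a number on velocity is to count events inside a sliding time window. The sketch below is a minimal illustration, not a description of any streaming product; the `RateMeter` class and its method names are hypothetical, and it assumes timestamps arrive in order.

```python
from collections import deque


class RateMeter:
    """Tracks how many events arrived in the last `window_seconds` seconds."""

    def __init__(self, window_seconds: float) -> None:
        self.window_seconds = window_seconds
        self._timestamps = deque()

    def record(self, timestamp: float) -> None:
        """Register one event; timestamps are assumed to be non-decreasing."""
        self._timestamps.append(timestamp)
        self._evict(timestamp)

    def _evict(self, now: float) -> None:
        # Drop events that have fallen out of the window ending at `now`.
        cutoff = now - self.window_seconds
        while self._timestamps and self._timestamps[0] <= cutoff:
            self._timestamps.popleft()

    def events_per_second(self, now: float) -> float:
        """Average event rate over the window ending at `now`."""
        self._evict(now)
        return len(self._timestamps) / self.window_seconds


if __name__ == "__main__":
    meter = RateMeter(window_seconds=10)
    for t in [0, 1, 2, 3, 4]:
        meter.record(t)
    print(meter.events_per_second(now=4))
```

A deque keeps eviction cheap because old events only ever leave from the front.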

Variety. A proliferation of data types from social, machine to machine, and mobile sources adds new data types to traditional transactional data. Data no longer fits into neat, easy to consume structures. New types include content, geo-spatial, hardware data points, location based, log data, machine data, metrics, mobile, physical data points, process, RFID, search, sentiment, streaming data, social, text, and web. The addition of unstructured data such as speech, text, and language increasingly complicates the ability to categorize data. Some technologies that deal with unstructured data include data mining, text analytics, and noisy text analytics.

Figure 1. The Three V's of Big Data

Contextual Scenarios Require Two More V's
In an age where we shift from transactions to engagement and then to experience, the forces of social, mobile, cloud, and unified communications add two more big data characteristics that should be considered when seeking insights. These characteristics highlight the importance and complexity required to solve context in big data.

Viscosity - Viscosity measures the resistance to flow in the volume of data. This resistance can come from different data sources, friction from integration flow rates, and the processing required to turn the data into insight. Technologies to deal with viscosity include improved streaming, agile integration buses, and complex event processing.
Virality - Virality describes how quickly information gets dispersed across people-to-people (P2P) networks. Virality measures how quickly data is spread and shared to each unique node. Time is a determining factor along with rate of spread.
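As a rough illustration of the virality definition, the sketch below counts how many unique nodes a piece of information has reached over time and averages that into a rate of spread. It is one possible reading of the definition, not an established metric; the function names and the `(time, sender, receiver)` event shape are assumptions made for the example.

```python
def reach_over_time(shares):
    """Return [(time, cumulative unique nodes reached)] for a share cascade.

    `shares` is a list of (time, sender, receiver) tuples (hypothetical format).
    """
    seen = set()
    curve = []
    for time, sender, receiver in sorted(shares):
        seen.update((sender, receiver))
        curve.append((time, len(seen)))
    return curve


def spread_rate(shares):
    """Average number of new unique nodes reached per unit of time."""
    curve = reach_over_time(shares)
    if len(curve) < 2:
        return 0.0
    (first_time, first_count), (last_time, last_count) = curve[0], curve[-1]
    elapsed = last_time - first_time
    return (last_count - first_count) / elapsed if elapsed > 0 else 0.0


if __name__ == "__main__":
    shares = [
        (0, "ann", "bob"),
        (1, "bob", "cy"),
        (2, "bob", "dee"),
        (4, "cy", "eve"),
    ]
    print(reach_over_time(shares))
    print(spread_rate(shares))
```

Both ingredients named in the definition appear here: the unique nodes reached and the time it took to reach them.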

Figure 2. The Five V's of Big Data

The Bottom Line: Big Data Provides The Key Element In Moving From Real Time To Right Time

Context represents the next frontier as we move to intelligent systems. Big data systems and techniques will provide the key infrastructure in delivering context within business processes, across relationships, by geo-spatial position, and within a time spectrum. As engagement systems make the shift to experiential systems, expect context to provide the key filter in improving signal-to-noise ratios. Big data provides the context required to move from real time to right time.
Catch Constellation's Big Data Coverage From VP and Principal Analyst - Neil Raden

Upcoming Report: Analytics in the Organization: Types, Roles and Skills
“Analytics” is a critical component of enterprise architecture capabilities, though most organizations have only recently begun to develop experience using quantitative methods. This report discusses the role of analytics, why it is a difficult topic for many, and what actions you should take. It lays out the various meanings of analytics and provides a framework for aligning the various types of analytics with the associated roles and skill sets needed.
Blog Post: What Is a Data Scientist (and What Isn't)
Big Data doesn't happen by itself. Because the tools and techniques are different from traditional Data Warehousing/Business Intelligence approaches, Big Data requires different skills. The role that brings those skills has become known as the Data Scientist. Have a look at analyst Neil Raden's take on the data scientist.
Watch for the following:


Understanding Data: Mechanical MDM, Ontology, Machine Learning

Future of IBM's Watson

Tainted Truth: How to Read Statistical Research

NoSQL: The End of the Relational Database

Analytical Platforms: Revenge of the Relational Database

Next Wave of BI

The Data Scientist

Planning and Performance Management Supercharged with Analytics

Hadoop vs. ETL vs. ELT

CEP: From Product Class to Wider Application

Real-Time Decision-Making: Where It Fits

Are Rules-Based Management Systems Dead?

Skills Checklist for Big Data

Skills Checklist for Business Analytics

Interactive Data Visualization

Let the Gorillas Write the Script: Forget Requirements

Data Warehouse Rescue: What to Do with your Legacy Warehouse

BI Rescue: What to Do with your Legacy BI

Your POV
What business problem will require you to start with Big Data? What are the key outcomes? Where do you expect to move the needle? Add your comments to the blog or send us a comment at R (at) SoftwareInsider (dot) org or R (at) ConstellationRG (dot) com.

Reprints
Reprints can be purchased through Constellation Research, Inc. To request official reprints in PDF format, please contact Sales.
Disclosure
Although we work closely with many mega software vendors, we want you to trust us. For the full disclosure policy, stay tuned for the full client list on the Constellation Research website.
* Not responsible for any factual errors or omissions. However, happy to correct any errors upon email receipt.
Copyright © 2001-2012 R Wang and Insider Associates, LLC. All rights reserved.
Contact the Sales team to purchase this report on an a la carte basis or join the Constellation Customer Experience!
