Book Notes: Big Data at Work - Davenport

  2017-11-23


ch 01 Why Big Data Is Important to You and Your Organization

    what is big data?
        gets the attention: size
        most difficult aspect: lack of structure
    big size
        not the important part
        biggest obstacle: imposing structure
    who are you
        executives
            marketing
            supply chain
            finance
            hr
        industries
            retail
            travel and transportation
            telecom
            media
            finance

ch 02 How Big Data Will Change Your Job, Your Company, and Your Industry

    which industries are affected by big data?
        industries that
            move things
            sell to consumers
            employ machinery
            sell or use content
            provide services
            have physical facilities
            involve money
    Four future scenarios
        business travel
    Telecom Firms

ch 03 Developing a Big Data Strategy

ch 07 What You Can Learn from Start-Ups and Online Firms

    Use Big Data for Product and Service Innovation
        uses in
            developing new product/service
                mostly in online
            difficult in tangible goods
                common approaches
                    how products are used
                    inform customers
                        driving more energy-efficiently
                    embedded in product
                        ex: tesla model s
                            monitor performance remotely
                            signal need for maintenance
                            comparison with other drivers
        large organizations?
            ways
                separate business unit
    Work on Tools, Not Just Applications
        hadoop, hive, cassandra
    Give Data Scientists Their Heads
        autonomy
        big data at linkedin
    Address the Productivity of Big Data Work
        virtual data marts - ebay
            data analysis environments
                eases creation of data sets
                    don't replicate existing data
                    allow unique views
            data lab - teradata
        data hub
            facilitates sharing of data, algorithms
        pivotal chorus - emc greenplum 
        automated a/b testing - linkedin
            1000 tests per day
    Contribute to the Commons
        Voldemort - distributed storage system
    Remember: Agile is too slow
    Big Data at Kyruus
        provides data to hospitals, insurance, pharma
            to understand physician networks
            "Bloomberg of physician information"
        data sources
            hr databases
            physician credentialing systems
            electronic patient records
            supply chain databases
            1,000+ sources
            most compelling: leakage of patient referrals outside of hospital
                average: 50%
        3 major groups
            data acquisition, integration, processing
            analytics
            applications and ui
    Take Advantage of Free and Low-Cost Stuff
        cloud
            amazon ec2
            google appengine
            ms azure
            cost effective
                more important: switching easy
        open source software
            data management tools
                hadoop, pig, hive
            R
                SAS:
                    ready-made specialized solutions:
                        to reduce credit card fraud
                        to optimize revenue in travel
    Experiment on a Large Scale
        randomized, controlled experiments
            a/b testing
        google
            10K tests a year
            on
                search
                advertising
        ebay
            web site designs
            offline tests
                lab studies
                home visits
        tools
            ebay platform
                leads the process
                keeps track of what's being tested
            optimizely
                a/b testing of websites
            applied predictive technologies
                management of testing for offline tests
            linkedin
                multiple changes in a page
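
A rough sketch of how a single a/b test of the kind listed above might be evaluated: a two-proportion z-test on conversion counts. The function name and the visitor and conversion numbers are made up for illustration; these notes do not describe how any of the firms' own tools work.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates.

    conv_a / n_a: conversions and visitors for variant A (control)
    conv_b / n_b: conversions and visitors for variant B (treatment)
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both variants convert equally.
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    # Hypothetical counts: 10,000 visitors per variant.
    z, p = two_proportion_z_test(200, 10_000, 250, 10_000)
    print(f"z = {z:.2f}, p = {p:.4f}")
```

With these made-up counts (2.0% vs 2.5% conversion) the difference is significant at the 5% level. With identical rates the function returns z = 0 and p = 1.
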
    Foster Close Collaboration
        gathering skills
            assembled teams of different skills
        people in one room
            use hackathons
    Lessons Not Learned by Start-Ups and Online Firms
        not sharing data with customers
            they don't let you know what data they have about you
        collecting data for data's sake
        big data at Recorded Future
            predicting future events
            source:
                websites, blogs, twitter
                8 billion events and entities
            Foresite
                linguistic processing
                scoring of events, entities
            intelligence agencies
                predictions for 
                    terror
                    political unrest
            private sector
                hedge funds
                    market intelligence
        Operating Analytics - Boston
            for hospitals to optimize use of operating rooms
                ORs expensive
                utilization < 50%
        Talking too much about technology

ch 08 What You Can Learn from Large Companies: Big Data and Analytics 3.0

    how new is it?
        for large firms: it has existed for a while
        what impresses them
            lack of structure
            power
            low cost
        it is about variety, not volume
            ex: 
                retail bank
                    first time: analyzing log files
                        understanding multichannel customer interactions
                        easier than transaction systems
                hotel firm
                    analyzing customer waiting lines with video analytics
                health insurer
                    predict customer dissatisfaction by analyzing speech-to-text from call center recordings
            sensors and operational devices
                more structured data 
                sensors in devices
                    GE, UPS, Schneider
                    estimates
                        $66 B savings from more efficient gas turbines
                        1% fuel reduction in aircraft => $30 B savings in 15 years
                Big Data at UPS
                    telematics sensors in 46 K vehicles
                        speed, direction, braking, drive performance
                    goal
                        redesign drivers' route structures
                    ORION (on-road integrated optimization and navigation)
                        largest operations research project
                    savings
                        8.4 M gal of fuel in 2011
                        85 M fewer miles
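
A back-of-envelope check on the aircraft and UPS figures above, as a small sketch. It assumes the aircraft savings accrue evenly and undiscounted over the 15 years, and that all of the UPS fuel saving comes from miles not driven. Neither assumption is stated in the notes, and the function names are illustrative only.

```python
def implied_annual_fuel_bill(total_savings_usd, years, reduction):
    """Annual fuel spend implied by a total saving from a fractional reduction.

    Assumes savings accrue evenly and are not discounted (an assumption,
    not something the notes state).
    """
    annual_savings = total_savings_usd / years
    return annual_savings / reduction


def implied_miles_per_gallon(miles_avoided, gallons_saved):
    """Fuel economy implied if all fuel saved came from miles not driven."""
    return miles_avoided / gallons_saved


if __name__ == "__main__":
    # Aircraft estimate: 1% fuel reduction => $30 B over 15 years.
    bill = implied_annual_fuel_bill(30e9, 15, 0.01)
    # UPS: 8.4 M gallons saved, 85 M fewer miles in 2011.
    mpg = implied_miles_per_gallon(85e6, 8.4e6)
    print(f"implied annual aircraft fuel bill: ${bill / 1e9:.0f} B")
    print(f"implied UPS fuel economy: {mpg:.1f} miles per gallon")
```

Under those assumptions the aircraft estimate implies a fuel bill of roughly $200 B a year, and the UPS numbers imply about 10 miles per gallon.
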
    Examples of Big Data Objectives in Large Companies
        caesars entertainment
            data from
                loyalty program
                web clickstreams
                real-time play in slot machines
            if customers have bad luck, they will never come back
                offer free meal coupon while they are at slot machine
        to support internal business decisions
        Big Data at United Healthcare
            uses NLP to understand customer satisfaction
            data source
                call center recordings
                    process
                        calls -> text -> hadoop -> nlp 
                mayo clinic medical records
                claims from United
                    to understand treatment of disease
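
A toy sketch of the last step of the call-center process noted above (calls -> text -> hadoop -> nlp). It assumes speech-to-text has already produced transcripts and uses a plain Python list where the notes mention Hadoop. The keyword lexicon, the threshold, and all names are hypothetical stand-ins for whatever NLP is actually used.

```python
from dataclasses import dataclass

# Hypothetical lexicon of words that hint at dissatisfaction.
NEGATIVE_TERMS = {"cancel", "frustrated", "complaint", "unacceptable"}


@dataclass
class Call:
    call_id: str
    transcript: str  # assumed output of an earlier speech-to-text step


def dissatisfaction_score(transcript: str) -> float:
    """Share of words in the transcript that appear in the negative lexicon."""
    words = [w.strip(".,!?").lower() for w in transcript.split()]
    if not words:
        return 0.0
    hits = sum(1 for w in words if w in NEGATIVE_TERMS)
    return hits / len(words)


def flag_calls(calls, threshold=0.05):
    """Return the ids of calls whose score meets the (arbitrary) threshold."""
    return [c.call_id for c in calls
            if dissatisfaction_score(c.transcript) >= threshold]


if __name__ == "__main__":
    calls = [
        Call("a", "I am frustrated and want to cancel"),
        Call("b", "Thanks, that solved my problem"),
    ]
    print(flag_calls(calls))
```

A real system would replace the word count with a trained model. The sketch only shows the shape of the pipeline: transcripts in, a score per call, and a list of calls to follow up on.
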
    Integrating Organizational Structures and Skills 
        Big Data at Macys.com
            retailer
            Kerem Tomak
            talent
                phd in science
                strong programmers with analytical skills
        Big Data at Bank of America
            emphasis on integrated approach
            three buckets
                big transactional data
                data about customers
                unstructured data
            projects
                understanding customers
                presenting offers to well defined segments
                providing cash-back offers