Big Data

Data Management and Exploratory Data Analysis

The scientific method: question, research, hypothesis, experiment, analysis and conclusion.

The CRISP-DM method: business understanding, data understanding, data preparation, modelling, evaluation and deployment.

Big data: volume, velocity, variety, veracity.

Reasons to use R:
R is open source and free
It is the language of statisticians
You can combine R with LaTeX Text …

Introduction

Big Data refers to the inability of traditional data architectures to efficiently handle the new datasets. The characteristics of Big Data that force new architectures are:

Volume (size of the dataset)
Variety (data from different sources)
Velocity (rate of flow)
Variability (the change in other characteristics)

Descriptive analytics

Data aggregation, such as grouping, sum, …
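The grouping and sum operations named under data aggregation can be illustrated with a minimal sketch. It is written in plain Python rather than R, and the `sales` records and the `sum_by_group` helper are hypothetical names invented for this example; they do not come from the notes.

```python
from collections import defaultdict

# Hypothetical sample records as (region, amount) pairs; illustrative only.
sales = [
    ("north", 120.0),
    ("south", 80.0),
    ("north", 30.0),
    ("east", 50.0),
    ("south", 20.0),
]


def sum_by_group(records):
    """Group (key, value) pairs by key and sum the values within each group."""
    totals = defaultdict(float)
    for key, value in records:
        totals[key] += value
    return dict(totals)


totals = sum_by_group(sales)

# Print one aggregated row per group, ordered by group name.
for region, total in sorted(totals.items()):
    print(f"{region}: {total}")
```

The same pattern extends to other descriptive aggregates such as counts or means by changing what is accumulated for each key.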

