Every measurement, and every quantity computed from measurements, carries a margin of error. We separate the signal from the noise using Statistics.
R, Python, SQL, Docker, and Spark cover everything from small data to big data, from research to production, from processing to visualization.
While not rocket science, Machine Learning, and especially Deep Learning (TensorFlow / PyTorch), can definitely fly a rocket.
Visualization tools such as ggplot2, Shiny, plotly, and d3.js can create excellent graphs with a touch of interactivity.
While Research and Agile Prototyping are fun, Production Deployments must be built using best practices.
Motivation might provide the spark, but Perseverance sets the steady pace that wins the race.
Data does not exist in a vacuum; it comes with context and human knowledge. Your Domain Specific Knowledge is paramount in answering questions about the data and guiding the project.
A quick look at the data raises more questions, but also removes unfeasible paths from the analysis. Early actionable insights are always welcome.
Machine Learning algorithms are applied to create and validate forecasts. Optimization, including parallelization, is employed as needed.
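The create-and-validate loop can be sketched in a few lines of Python. This is a minimal illustration, not a production method: the series values, the holdout boundary, and the naive "last value" model are all invented for the example.

```python
# Illustrative time series and holdout split (values are made up).
series = [112, 118, 132, 129, 121, 135, 148, 148, 136, 119]
split = 7  # hypothetical boundary between training and test windows

train, test = series[:split], series[split:]

# Naive "model": forecast the last observed training value for every step.
forecast = [train[-1]] * len(test)

# Validate: mean absolute error on the held-out window.
mae = sum(abs(f - a) for f, a in zip(forecast, test)) / len(test)
print(round(mae, 2))  # → 13.67
```

Any real model (ARIMA, gradient boosting, a neural network) slots into the same pattern: fit on the training window, forecast the holdout, and score the forecast before trusting it.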
We have the forecast / prototype / deployment. Does it make sense? Can we improve its performance? Does it scale?
Accomplished Data Scientist with advanced quantitative skills in various facets of economics, finance, marketing, and statistics. More than fifteen years of analytical and research experience, both professionally and in academia.