Data Science Mastery – Top Courses and Specializations


Last updated on May 8th, 2018 at 01:50 pm

 


Coursera is one of the best places to learn professional skills online. It offers three kinds of learning tracks: Courses, Specializations, and Degrees; this article covers the first two. A Course, as the name suggests, is an individual class covering a specific area or topic, while a Specialization is a group of courses geared toward making you an expert in a specific field. In this article, I have listed 12 different specializations and, for each one, the courses that comprise it.

 

Courses on Coursera can be watched for free, but to take part in assignments and to earn certificates you will need to be on a paid plan. Specializations, on the other hand, require you to be a paid subscriber; subscriptions are usually 49 USD or 99 USD depending on the specialization or course in question. If you just want to take a course for a spin, you can join it for free and watch all the videos in its curriculum.

 

In this article I have listed 12 data science specializations and, within each, all the courses that make up the specialization. You don’t have to study them all: pick one specialization and, most importantly, try to get a job in the field of data engineering or data science. After that you will be well placed to decide which specialization or course to join next in order to improve your skills.

 

Here are the data science courses and specializations that will help you become a data science expert:

 

#1 Data Science Specialization

From asking the right kinds of questions to making inferences and publishing results, this Specialization, created by Johns Hopkins University, covers all the tools and concepts that you’ll need throughout the entire data science pipeline.

 

The courses that make up this Coursera Specialization are:

 

a) The Data Scientist’s Toolbox

Created by Johns Hopkins University, this is the first course of the Data Science Specialization, in which you will get an introduction to the main ideas and tools in the data scientist’s toolbox. The course gives an overview of the questions, tools and data that data scientists and data analysts work with.

The course comprises two components. The first is a conceptual introduction to the ideas behind turning data into actionable knowledge. The second is a practical introduction to the tools used throughout the program, such as:

  • Version control
  • Markdown
  • Git
  • GitHub
  • R
  • RStudio

b)  R Programming

Created by Johns Hopkins University, this second course of the Data Science Specialization teaches you how to program in R and how to use R for effective data analysis. You will learn how to install and configure the software necessary for a statistical programming environment, and how to describe generic programming language concepts as they are implemented in a high-level statistical language. The course covers practical issues in statistical computing, including:

  • programming in R,
  • reading data into R,
  • accessing R packages,
  • writing R functions,
  • debugging,
  • profiling R code,
  • organizing and commenting R code

c)  Getting and Cleaning Data

Created by the Johns Hopkins University, this third course of Data Science Specialization will cover the basic ways that data can be obtained from databases, from APIs, from colleagues and from the web in various formats.

It will also cover the basics of how to make data clean & “tidy”. The course will also cover the components of a complete data set including:

  • raw data,
  • processing instructions,
  • codebooks,
  • And processed data.

The course will cover the basics needed for cleaning, sharing & collecting data.

d) Exploratory Data Analysis

Created by the Johns Hopkins University, this fourth course of Data Science Specialization covers the essential exploratory techniques for summarizing data. Applied before formal modeling commences, these techniques can help inform the development of more complex statistical models.

Exploratory techniques are also important for eliminating, or sharpening potential hypotheses about the world that can be addressed by the data. We will cover in detail some of the basic principles of constructing data graphics, as well as the plotting systems in R. Some of the common multivariate statistical techniques which are used to visualize high-dimensional data are also covered.

e) Reproducible Research

Created by the Johns Hopkins University, this fifth course of Data Science Specialization focuses on the tools & concepts behind reporting modern data analyses in a reproducible manner. Reproducible research is the idea that data analyses are published with their software code and data so that others may verify and build upon the findings.

As data analyses become more complex, involving larger datasets and more sophisticated computations, the need for reproducibility increases as well. Instead of focusing on superficial details reported in a written summary, reproducibility allows people to focus on the actual content of a data analysis.

This course focuses on literate statistical analysis tools, which allow one to publish data analyses in a single document that others can easily execute to obtain the same results.

f) Statistical Inference

The process of drawing conclusions about scientific truths or populations from data is called Statistical inference. There are many modes of performing inference including:

  • Statistical modeling,
  • Data oriented strategies
  • And explicit use of randomization and designs in analyses

Furthermore, there are broad theories (Bayesian, frequentist, design-based, likelihood, …) and numerous complexities (observed and unobserved confounding, missing data, biases) in performing inference.

A practitioner can often be left in a debilitating maze of philosophies, nuance and techniques. This course presents the fundamentals of inference in a practical approach geared toward getting things done. By the end of the course, students will understand the broad directions of statistical inference and be able to use this knowledge to make informed choices when analyzing data.

g) Regression Models

Regression models are among the most important statistical analysis tools in a data scientist’s toolkit.

Created by the Johns Hopkins University, this seventh course of Data Science Specialization covers least squares, inference and regression analysis using regression models. ANOVA and ANCOVA (Special cases of the regression model) will be covered as well. Analysis of variability & residuals will be investigated. The course will cover novel uses of regression models including scatterplot smoothing, as well as modern thinking on model selection.

h) Practical Machine Learning

Machine learning and prediction are among the most common tasks performed by data analysts and data scientists. Created by Johns Hopkins University, this eighth course of the Data Science Specialization will cover the basic components of building and applying prediction functions, with an emphasis on practical applications.

The course will provide basic grounding in concepts such as overfitting, training and tests sets, and error rates. The course will also introduce a range of model based and algorithmic machine learning methods including:

  • Regression,
  • Classification trees,
  • Naive Bayes,
  • And random forests

The complete process of building prediction functions is also covered, including data collection, feature creation, algorithms, and evaluation.
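The train/test split and error-rate ideas above can be sketched in a few lines of plain Python. This is a toy threshold "model" on invented data, not material from the course itself:

```python
import random

# Toy dataset (hypothetical): the true rule is "label is 1 when x > 5".
random.seed(0)
data = [(x, int(x > 5)) for x in range(10)] * 10
random.shuffle(data)

# Hold out 30% as a test set so the error rate is measured on unseen data.
split = int(len(data) * 0.7)
train, test = data[:split], data[split:]

# A deliberately simple "model": threshold at the mean of the positive examples.
positives = [x for x, y in train if y == 1]
threshold = sum(positives) / len(positives)

def predict(x):
    return int(x >= threshold)

# Error rate = fraction of test examples the model gets wrong.
errors = sum(predict(x) != y for x, y in test)
print(errors / len(test))
```

Measuring error on the held-out test set, rather than on the training data, is exactly what guards against the overfitting the course discusses.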

i) Developing Data Products

Created by the Johns Hopkins University, this ninth course of Data Science Specialization covers the basics of creating data products using:

  • Shiny,
  • R packages,
  • And interactive graphics

The course will focus on the statistical fundamentals of creating a data product that can be used to tell a story about data to a mass audience.

j) Data Science Capstone

The capstone project will be drawn from real-world problems and will allow students to create a public, usable data product that can be used to show their skills to potential employers. It will be conducted with government, industry, and academic partners.

 

#2 Introduction to Applied Data Science Specialization

Created by IBM, this Specialization helps learners develop foundational data science skills. It focuses on understanding what data science is and what kinds of activities a data scientist performs.

It will familiarize students with various open source tools used by data scientists (like Jupyter notebooks), and provides knowledge of the methodology involved in tackling data science problems. It also teaches relational database concepts and the use of SQL to query databases. Learners will complete hands-on projects and labs to apply their newly acquired knowledge and skills.

 

The courses that make up this Coursera Specialization are:

 

a) What is Data Science?

Created by IBM, in this course the learners will meet some data science practitioners and will get an overview of what data science is today.

b) Open Source tools for Data Science

What are some of the most popular data science tools? What are their features, and how do you use them? Created by IBM, in this second course of the Introduction to Applied Data Science Specialization you’ll learn about:

  • Jupyter Notebooks,
  • RStudio IDE,
  • Apache Zeppelin,
  • And Data Science Experience

You will learn what each tool is used for, what its features and limitations are, and which programming languages it can execute.

c) Data Science Methodology

Created by IBM, it’s the third course of Introduction to Applied Data Science Specialization. Its purpose is to share a methodology that can be used within data science, to ensure that the data used in problem solving is relevant and properly manipulated to address the question at hand.

Accordingly, in this course, you will learn:

  • The major steps involved in tackling a data science problem.
  • The major steps involved in practicing data science, from forming a concrete business or research problem, to collecting and analyzing data, to building a model, and understanding the feedback after model deployment.
  • How data scientists think!

d) Databases and SQL for Data Science

Created by IBM, the purpose of this course is to help you learn and apply the SQL language and to introduce relational database concepts. Using a cloud-based environment, learners practice building and running SQL queries hands-on, and learn how to access databases from Jupyter notebooks using Python.
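As a rough illustration of the kind of query work involved, here is a sketch using Python's built-in sqlite3 module instead of the course's cloud environment, with a made-up table:

```python
import sqlite3

# Build a tiny in-memory relational table (hypothetical schema and data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE courses (name TEXT, provider TEXT)")
conn.executemany("INSERT INTO courses VALUES (?, ?)",
                 [("What is Data Science?", "IBM"),
                  ("Data Science Methodology", "IBM")])

# A parameterized SELECT, the bread and butter of SQL from Python.
rows = conn.execute(
    "SELECT name FROM courses WHERE provider = ? ORDER BY name", ("IBM",)
).fetchall()
print(rows)
conn.close()
```

The same pattern (connect, execute, fetch) carries over to the cloud databases used in the course, only the connection details change.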

 

#3 Applied Data Science with Python Specialization

This specialization, created by the University of Michigan, introduces learners to data science through the Python programming language. It is intended for learners with a basic Python or programming background who want to apply the following techniques:

  • Statistical,
  • Machine learning,
  • Information visualization,
  • Text analysis,
  • And social network analysis

through popular Python toolkits such as matplotlib, pandas, nltk, scikit-learn, and networkx, to gain insight into their data.

 

The courses that make up this Coursera Specialization are:

 

a) Introduction to Data Science in Python

Created by the University of Michigan, it’s the first course of the Applied Data Science with Python Specialization. Learners are introduced to the basics of the Python programming environment, including fundamental Python programming techniques such as:

  • lambdas,
  • reading and manipulating csv files,
  • and the numpy library

Using the popular pandas data science library, the course introduces data manipulation and cleaning techniques, and presents the DataFrame as the central data structure for data analysis, along with tutorials on how to use functions such as merge, groupby, and pivot tables effectively. After taking the course, students will be able to take tabular data, manipulate it, clean it, and run basic inferential statistical analyses.
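For a flavor of merge and groupby, here is a minimal sketch on two toy tables invented for illustration (the course's own datasets differ):

```python
import pandas as pd

# Two small tables sharing an "id" key (hypothetical data).
students = pd.DataFrame({"id": [1, 2, 3], "track": ["R", "Python", "Python"]})
scores = pd.DataFrame({"id": [1, 2, 3], "score": [80, 90, 70]})

# merge: join the two tables on the shared "id" column.
merged = students.merge(scores, on="id")

# groupby: aggregate the joined result, here the mean score per track.
means = merged.groupby("track")["score"].mean()
print(means)
```

merge joins rows by key and groupby splits, aggregates, and recombines; pivot tables are essentially a groupby reshaped into a two-dimensional layout.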

b) Applied Plotting, Charting & Data Representation in Python

Created by the University of Michigan, this second course of Applied Data Science with Python Specialization will introduce the learner to information visualization basics, with a focus on charting and reporting using the matplotlib library.

c) Applied Machine Learning in Python

Created by the University of Michigan, this third course of Applied Data Science with Python Specialization will introduce the learner to applied machine learning, focusing more on the techniques and methods than on the statistics behind these methods.

d) Applied Text Mining in Python

Created by the University of Michigan, this fourth course of Applied Data Science with Python Specialization will introduce the learner to text mining and text manipulation basics.

It starts with an understanding of how Python handles text, the structure of text for both humans and machines, and an overview of the nltk framework for manipulating text. The second week focuses on common manipulation needs, including:

  • regular expressions (searching for text),
  • cleaning text,
  • and preparing text for use by machine learning processes.

In the third week, basic natural language processing methods are applied to text, and the course demonstrates how text classification is accomplished. The final week explores more advanced methods for detecting topics in documents and grouping them by similarity (topic modeling).
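A minimal sketch of the week-two cleaning step, using Python's built-in re module (the example sentence is invented; nltk offers richer tokenizers):

```python
import re

# A common cleaning pipeline: lowercase, strip punctuation, tokenize.
raw = "Text mining, in Python: it's fun!"
lowered = raw.lower()
cleaned = re.sub(r"[^a-z\s]", " ", lowered)   # keep only letters and whitespace
tokens = cleaned.split()
print(tokens)
```

Regular expressions like the one above are the "searching for text" tool the bullet list refers to; preparing text for machine learning usually starts with exactly this kind of normalization.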

e) Applied Social Network Analysis in Python

Created by University of Michigan, this fifth course of Applied Data Science with Python Specialization will introduce the learner to network analysis through tutorials using the NetworkX library.

The course begins with an understanding of what network analysis is and motivations for why we might model phenomena as networks. The concept of connectivity and network robustness is introduced in the second week.

The third week will explore ways of measuring the importance or centrality of a node in a network. The final week will explore the evolution of networks over time and cover models of network generation and the link prediction problem.
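The week-three notion of degree centrality can be sketched without any dependencies; in the course you would call networkx.degree_centrality instead. The toy graph here is invented:

```python
# A small undirected toy graph as an edge list (hypothetical).
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C")]

# Build an adjacency set for each node.
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

# Degree centrality = degree / (n - 1): a node connected to every
# other node scores 1.0, matching networkx's normalization.
n = len(adj)
centrality = {node: len(neigh) / (n - 1) for node, neigh in adj.items()}
print(centrality)
```

Node A touches all three other nodes, so it scores 1.0 and is the most "important" node under this measure.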

 

#4 Data Science at Scale Specialization

Created by the University of Washington, this Specialization covers intermediate topics in data science. You will gain hands-on experience with:

  • Scalable SQL and NoSQL data management solutions,
  • Data mining algorithms,
  • Practical statistical and machine learning concepts

You will also learn to visualize data and communicate results, and you’ll explore legal and ethical issues that arise in working with big data.

 

The courses that make up this Coursera Specialization are:

 

a) Data Manipulation at Scale: Systems and Algorithms

Created by the University of Washington, it’s the first course of Data Science at Scale Specialization. In it you will learn the landscape of relevant systems, their tradeoffs, how to evaluate their utility against your requirements and the principles on which they rely.

You will learn how practical systems were derived from the frontier of computer science research, and what systems are coming on the horizon. Spark and its contemporaries, SQL and NoSQL databases, cloud computing, MapReduce and the ecosystem it spawned, and specialized systems for graphs and arrays will be covered.

You will also learn the context and history of data science, how to structure a data science project, the challenges, skills, and methodologies the term implies.

b) Practical Predictive Analytics: Models and Methods

Created by the University of Washington, in this second course of Data Science at Scale Specialization you will analyze the results of your designed statistical experiments using modern methods.  The common pitfalls in interpreting statistical arguments will also be explored, especially those associated with big data.

Learning Goals:  After completing this course, you will be able to:

  • Design effective experiments and analyze the results
  • Use resampling methods to make clear and bulletproof statistical arguments without invoking esoteric notation
  • Apply and explain a set of classification methods of increasing complexity (trees, rules, random forests), and associated optimization methods (gradient descent and variants)
  • Explain and apply a set of unsupervised learning concepts and methods
  • Describe common idioms of large-scale graph analytics, including structural queries, traversals and recursive queries, community detection, and PageRank.
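The resampling idea behind the second learning goal can be sketched with a bootstrap on toy numbers (invented data; the course uses richer examples):

```python
import random

# Bootstrap: estimate the uncertainty of a sample mean by repeatedly
# resampling the data with replacement (fixed seed for reproducibility).
random.seed(42)
sample = [4, 8, 6, 5, 3, 7, 9, 5]

boot_means = []
for _ in range(2000):
    resample = [random.choice(sample) for _ in sample]
    boot_means.append(sum(resample) / len(resample))

# A rough 95% interval from the middle of the bootstrap distribution.
boot_means.sort()
lo, hi = boot_means[50], boot_means[1949]
print(lo, hi)
```

No esoteric notation is needed: the spread of the resampled means directly shows how much the estimate would wobble under repeated sampling.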

c) Communicating Data Science Results

Created by the University of Washington, in this third course of the Data Science at Scale Specialization you will learn to recognize, design, and use effective visualizations.

Learning Goals:  After completing this course, you will be able to:

  • Design and critique visualizations
  • Explain the state-of-the-art in ethics, privacy, governance around data science and big data.
  • Analyze large datasets in a reproducible way using cloud computing

d) Data Science at Scale – Capstone Project

In the capstone, students will engage in a real-world project requiring them to apply skills from the entire data science pipeline:

  • preparing data,
  • organizing data,
  • transforming data,
  • constructing a model,
  • and evaluating results

Through collaboration with Coursolve, each Capstone project is associated with partner stakeholders who have a vested interest in your results, and are eager to deploy them in practice.

 

#5 Executive Data Science Specialization

Created by Johns Hopkins University, in this specialization you will learn what you need to know to begin assembling, and leading a data science enterprise, even if you have never worked in data science before.

The courses that make up this Coursera Specialization are:

 

a) A Crash Course in Data Science

Created by Johns Hopkins University, this is a focused course designed to rapidly get you up to speed on the field of data science.

After completing this course you will know:

  • How machine learning, statistics, and software engineering play a role in data science
  • How to describe the structure of a data science project
  • The key terms and tools used by data scientists
  • How to identify a successful and an unsuccessful data science project
  • The role of a data science manager

b) Building a Data Science Team

In this one-week course created by Johns Hopkins University, we will cover how you can find the right people to fill out your data science team, how to organize them to give them the best chance to feel empowered and successful, and how to manage your team as it grows.

After completing this course you will know:

  • The different roles such as data scientist and data engineer in the data science team
  • The expected qualifications of different data science team members
  • Relevant questions for interviewing data scientists
  • How to manage the onboarding process for the team
  • How to encourage and empower data science teams as well as how to guide them to success

c) Managing Data Analysis

This one-week course created by Johns Hopkins University describes the process of analyzing data and how to manage that process. It also describes the iterative nature of data analysis, and the role of stating a sharp question, exploratory data analysis, inference, formal statistical modeling, interpretation, and communication.

After completing this course you will know how to:

  • Describe the basic data analysis iteration
  • Identify different types of questions and translate them to specific datasets
  • Describe different types of data pulls
  • Determine if data are appropriate for a given question by exploring datasets
  • Direct model building efforts in common data analyses
  • Interpret the results from common data analyses
  • Integrate statistical findings to form coherent data analysis presentations

d) Data Science in Real Life

In this one-week course created by Johns Hopkins University, we contrast the ideal with what happens in real life. By contrasting the ideal, you will learn key concepts that will help you manage real life analyses.

After completing this course you will know how to:

  • Describe the “perfect” data science experience
  • Identify weaknesses and strengths in experimental designs
  • Describe possible pitfalls when pulling / assembling data and learn solutions for managing data pulls.
  • Challenge statistical modeling assumptions and drive feedback to data analysts
  • Describe common pitfalls in communicating data analyses
  • Learn about the life of a data analysis manager.

The course will be taught at a conceptual level for active managers of data scientists and statisticians.  Some key concepts being discussed include:

  • Experimental design, randomization, A/B testing
  • Causal inference, counterfactuals,
  • Strategies for managing data quality.
  • Bias and confounding
  • Contrasting machine learning versus classical statistical inference

e) Executive Data Science Capstone

The Capstone allows learners to apply what they’ve learned to a real-world scenario developed in collaboration with Zillow (a data-driven online real estate and rental marketplace), and DataCamp (a web-based platform for data science programming).

To demonstrate that you have what it takes to shepherd a complex analysis project from start to finish, you will have to lead a virtual data science team and make key decisions along the way. For the final project, you will prepare and submit a presentation, which will be evaluated and graded by your fellow capstone participants.

 

#6  Genomic Data Science Specialization

Created by Johns Hopkins University, this specialization covers the tools and concepts to analyze, interpret and understand data from next generation sequencing experiments. It teaches the most common tools used in genomic data science including how to use the:

  • Command line,
  • Python,
  • R,
  • Bioconductor,
  • And Galaxy.

 

The courses that make up this Coursera Specialization are:

 

a) Introduction to Genomic Technologies

Created by Johns Hopkins University, this course introduces you to the basic biology of modern genomics and the experimental tools used to measure it. The Central Dogma of Molecular Biology will be introduced, along with how next-generation sequencing can be used to measure:

  • DNA,
  • RNA,
  • And epigenetic patterns

You’ll also get an introduction to the key concepts in data science and computing that you’ll need to understand how data from next-generation sequencing experiments are generated and analyzed.

b) Genomic Data Science with Galaxy

Created by Johns Hopkins University, in this course you’ll learn to use the tools that are available from the Galaxy Project.

c) Python for Genomic Data Science

Created by Johns Hopkins University, in this course you’ll get an introduction to the Python programming language and the IPython notebook.

d) Algorithms for DNA Sequencing

Created by Johns Hopkins University, in this course you’ll learn algorithms, computational methods and data structures for analyzing DNA sequencing data. You will learn a little about genomics, DNA, and how DNA sequencing is used.  We will use Python to implement data structures and key algorithms and to analyze DNA sequencing datasets and real genomes.

e) Command Line Tools for Genomic Data Science

Created by Johns Hopkins University, this course introduces you to the commands that you need to manage and analyze files, directories, and large sets of genomic data.

f) Bioconductor for Genomic Data Science

Created by Johns Hopkins University, in this course you’ll learn to use tools from the Bioconductor project to perform analysis of genomic data.

g) Statistics for Genomic Data Science

Created by Johns Hopkins University, this course is an introduction to the statistics behind the most popular genomic data science projects.

h) Genomic Data Science Capstone

In the Capstone project, you will deploy the techniques and tools that you’ve mastered over the course of the specialization. You’ll work with a real data set to perform analyses and prepare a report of your findings.

 


 

#7 Big Data Specialization

Created by the University of California, San Diego, this specialization gives you hands-on experience with the systems and tools used by big data engineers and scientists, and an understanding of the insights big data can provide. You will be guided through the basics of using Hadoop with Spark, MapReduce, Hive and Pig.

 

The courses that make up this Coursera Specialization are:

 

a)  Introduction to Big Data

Created by the University of California, San Diego, this course is for those who want to become conversant with the core concepts and the terminology behind big data problems, systems and applications.

It provides an introduction to Hadoop, one of the most common frameworks that has made big data analysis easier and more accessible, increasing the potential for data to transform our world!

At the end of this course, you will be able to:

  • Describe the Big Data landscape including examples of real world big data problems including the three key sources of Big Data: organizations, people, and sensors.
  • Explain the V’s of Big Data (velocity, volume, veracity, variety, valence, and value) and why each impacts monitoring, data collection, analysis, storage, and reporting.
  • Get value out of Big Data by using a 5-step process to structure your analysis.
  • Identify big data problems and be able to recast these problems as data science questions.
  • Provide an explanation of the programming models and architectural components used for scalable big data analysis.
  • Summarize the features and value of core Hadoop stack components including the YARN resource and job management system, the HDFS file system and the MapReduce programming model.
  • Install and run a program using Hadoop!

b)   Big Data Modeling and Management Systems

Created by the University of California, San Diego, this second course of the Big Data Specialization introduces you to various data genres and the management tools appropriate for each. You will be able to describe the reasons behind the evolving plethora of new big data platforms from the perspective of big data management systems and analytical tools.

You will become familiar with these techniques through guided hands-on tutorials using semi-structured and real-time data examples. Systems and tools discussed include: HP Vertica, AsterixDB, Neo4j, Impala, Redis, and SparkSQL. This course provides techniques to discover new data sources and extract value from existing untapped ones.

At the end of this course, you will be able to:

  • Recognize different data elements in everyday life problems
  • Explain why your team needs an Information System Design and a Big Data Infrastructure Plan
  • Identify the frequent data operations required for various types of data
  • Select a data model to suit the characteristics of your data
  • Apply techniques to handle streaming data
  • Differentiate between a Big Data Management System and a traditional Database Management System
  • Design a big data information system for an online game company

c)  Big Data Integration and Processing

This is the third course in the Big Data Specialization from University of California, San Diego. At the end of the course, you will be able to:

  • Retrieve data from example big data management systems and databases
  • Describe the connections between data management operations and the big data processing patterns needed to utilize them in large-scale analytical applications
  • Identify when data integration is needed by a big data problem
  • Execute simple big data processing and integration on Spark and Hadoop platforms

d)  Machine Learning With Big Data

Created by the University of California, San Diego, this course provides an overview of machine learning techniques used to explore, analyze, and leverage data. You will be introduced to algorithms and tools you can use to create machine learning models that learn from data, and to scale those models up to big data problems.

After completing the course, you will be able to:

  • Use the steps in the machine learning process, to design an approach to leverage data
  • Apply machine learning techniques to prepare and explore data for modeling.
  • Identify the type of machine learning problem in order to apply the appropriate set of techniques.
  • Construct models that learn from data using widely available open source tools.
  • Analyze big data problems using scalable machine learning algorithms on Spark.

e)  Graph Analytics for Big Data

Created by the University of California, San Diego, this course gives you a broad overview of the field of graph analytics so you can learn new ways to:

  • model,
  • store,
  • retrieve
  • and analyze graph-structured data

f)  Big Data – Capstone Project

In the Capstone Project, you will build a big data ecosystem using methods and tools from the earlier courses in this specialization.

You will analyze a data set simulating big data generated from a large number of users who are playing our imaginary game “Catch the Pink Flamingo”. During the five-week Capstone Project, you will walk through the typical big data science steps of:

  • acquiring,
  • exploring,
  • preparing,
  • analyzing,
  • and reporting.

 

#8 Big Data for Data Engineers Specialization

Created by Yandex, in this specialization you will learn the basics of MapReduce, Hadoop, Spark, methods of real-time data processing, offline data processing for warehousing, and large-scale machine learning.

This specialization will hone your skills in designing solutions for common Big Data tasks:

  • creating batch and real-time data processing pipelines,
  • doing machine learning at scale,
  • and deploying machine learning models into a production environment — and much more!

 

The courses that make up this Coursera Specialization are:

 

a) Big Data Essentials: HDFS, MapReduce and Spark RDD

In this 6-week course created by Yandex you will:

  • learn some of the basic technologies of the modern Big Data landscape, namely MapReduce, HDFS and Spark;
  • learn about distributed file systems, what function they serve and why they exist;
  • grasp the MapReduce framework (a workhorse for many modern Big Data applications); and will solve sample business cases by applying the framework to process texts;
  • learn about Spark, (the next-generation computational framework); and will build a strong understanding of its basic concepts;
  • develop skills to apply these tools to creating solutions in social networks, finance, telecommunications and many other fields.
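The map-shuffle-reduce pattern at the heart of that framework can be sketched in plain Python (a word count over two invented documents; real MapReduce or Spark distributes these same phases across machines):

```python
from collections import defaultdict

# Two toy "documents" standing in for a distributed text corpus.
docs = ["big data big ideas", "data pipelines at scale"]

# Map phase: emit a (word, 1) pair for every word in every document.
mapped = [(word, 1) for doc in docs for word in doc.split()]

# Shuffle phase: group the emitted values by key.
groups = defaultdict(list)
for word, count in mapped:
    groups[word].append(count)

# Reduce phase: aggregate each key's values, here by summing.
counts = {word: sum(vals) for word, vals in groups.items()}
print(counts)
```

The power of the framework is that the map and reduce steps are embarrassingly parallel; only the shuffle requires moving data between workers.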

b) Big Data Analysis: Hive, Spark SQL, DataFrames and GraphFrames

This course created by Yandex will teach you how to:

  • Warehouse your data efficiently using Hive, Spark SQL and Spark DataFrames.
  • Work with large graphs, such as social graphs or networks.
  • Optimize your Spark applications for maximum performance.

Precisely, you will master your knowledge in:

  • Writing and executing Hive & Spark SQL queries;
  • Reasoning how the queries are translated into actual execution primitives (be it MapReduce jobs or Spark transformations);
  • Organizing your data in Hive to optimize disk space usage and execution times;
  • Constructing Spark DataFrames and using them to write ad-hoc analytical jobs easily;
  • Processing large graphs with Spark GraphFrames;
  • Debugging, profiling and optimizing Spark application performance.

Still in doubt? Become a data ninja by taking this course!
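
Running Hive or Spark SQL requires a cluster, but the declarative query style they teach can be previewed locally with Python's built-in sqlite3 module. The table and rows below are invented for illustration:

```python
import sqlite3

# An in-memory database standing in for a Hive / Spark SQL table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE page_views (user_id INTEGER, url TEXT)")
conn.executemany("INSERT INTO page_views VALUES (?, ?)",
                 [(1, "/home"), (1, "/docs"), (2, "/home"), (3, "/home")])

# The same GROUP BY aggregation you would write in HiveQL or Spark SQL.
rows = conn.execute(
    "SELECT url, COUNT(*) AS views FROM page_views "
    "GROUP BY url ORDER BY views DESC"
).fetchall()
print(rows)  # [('/home', 3), ('/docs', 1)]
```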

c) Big Data Applications: Machine Learning at Scale

Machine learning is transforming the world around us. To become successful, you’d better know what kinds of problems can be solved with machine learning, and how they can be solved. Don’t know where to start? The answer is one button away.

During this course created by Yandex you will:

  • Identify practical problems which can be solved with machine learning
  • Build, tune and apply linear models with Spark MLlib
  • Understand methods of text processing
  • Fit decision trees and boost them with ensemble learning
  • Construct your own recommender system.

As a practical assignment, you will:

  • build and apply linear models for classification and regression tasks;
  • learn how to work with texts;
  • automatically construct decision trees and improve their performance with ensemble learning;
  • Finally, you will build your own recommender system!

With these skills, you will be able to tackle many practical machine learning tasks.

We provide the tools; you choose the place of application to make this world of machines more intelligent.
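
Spark MLlib itself needs a Spark installation, but the core of the "build and tune linear models" step can be sketched in plain Python: fit y = w*x + b by gradient descent on squared error. A toy single-machine sketch, not the course's MLlib code:

```python
def fit_linear(xs, ys, lr=0.01, epochs=5000):
    """Fit y = w*x + b by batch gradient descent on mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Data generated from y = 2x + 1; the fit should recover those coefficients.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
w, b = fit_linear(xs, ys)
print(round(w, 2), round(b, 2))  # 2.0 1.0
```

MLlib applies the same idea at cluster scale, computing the gradients in parallel across data partitions.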

d) Big Data Applications: Real-Time Streaming

There is a significant number of tasks where we need not just to process an enormous volume of data, but to process it as quickly as possible. Delays in tsunami prediction can cost people’s lives. Delays in traffic-jam prediction cost extra time. Advertisements based on users’ recent activity are ten times more popular.

However, stream processing techniques alone are not enough to create a complete real-time system. For example, to build a recommendation system we need storage that can store and fetch data for a user with minimal latency. These databases should be able to:

  • store hundreds of terabytes of data,
  • handle billions of requests per day
  • and maintain 100% uptime

NoSQL databases are commonly used to solve this challenging problem.

After you finish this course created by Yandex, you will master stream processing systems and NoSQL databases. You will also learn how to use such popular and powerful systems as Kafka, Cassandra and Redis.
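
The windowed aggregations at the heart of stream processing can be illustrated with a toy sliding-window counter in plain Python. This is a single-machine sketch of the concept, not Kafka or Cassandra code:

```python
from collections import deque

class SlidingWindowCounter:
    """Count how many events arrived in the last `window` seconds."""
    def __init__(self, window):
        self.window = window
        self.events = deque()  # event timestamps, oldest first

    def add(self, timestamp):
        self.events.append(timestamp)
        # Evict events that have fallen out of the window.
        while self.events and self.events[0] <= timestamp - self.window:
            self.events.popleft()

    def count(self):
        return len(self.events)

counter = SlidingWindowCounter(window=60)
for t in [0, 10, 50, 65, 70]:  # event times in seconds
    counter.add(t)
print(counter.count())  # 3: only the events at 50, 65 and 70 fall within 60 s of t=70
```

Real streaming systems apply the same evict-and-aggregate pattern, but partitioned across machines and backed by fault-tolerant storage.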

e) Big Data Services: Capstone Project

Are you ready to close the loop on your Big Data skills? Do you want to put all the knowledge you gained from the previous courses into practice? In the Capstone project, you will integrate everything you have learned to build a real application leveraging the power of Big Data.

You will be given a task to combine data from different sources of different types (static distributed dataset, streaming data, SQL or NoSQL storage). Combined, this data will be used to build a predictive model for a financial market (as an example). First, you will design a system from scratch and share it with your peers to get valuable feedback. Second, you can make it public, so get ready to receive feedback from your service users. Real-world experience without any 3D glasses or mock interviews.

 

#9 Data Visualization with Tableau Specialization

Created by University of California, Davis, this Specialization will enable you to generate powerful dashboards and reports that help people take action and make decisions based on their business data. You will use Tableau to create high-impact visualizations of common data analyses to help you see and understand your data. You will apply predictive analytics to improve business decision making. The Specialization culminates in a Capstone Project in which you will use sample data to create visualizations, dashboards, and data models to prepare a presentation to the executive leadership of a fictional company.

 

The courses that make up this Coursera Specialization are:

 

a) Fundamentals of Visualization with Tableau

Created by University of California, Davis, in this first course of the specialization you will discover just what data visualization is, and how we can use it to better see and understand data.

Using Tableau, we’ll examine the fundamental concepts of data visualization and explore the Tableau interface, identifying and applying the various tools Tableau has to offer. By the end of the course you will be able to prepare and import data into Tableau and explain the relationship between data analytics and data visualization.

This course is designed for the learner who has never used Tableau before, or who may need a refresher or want to explore Tableau in more depth.  No prior technical or analytical background is required.  The course will guide you through the steps necessary to create your first visualization story from the beginning based on data context, setting the stage for you to advance to the next course in the Specialization.

b) Essential Design Principles for Tableau

Created by University of California, Davis, in this course you will analyze and apply essential design principles to your Tableau visualizations. This course assumes you understand the tools within Tableau and have some knowledge of the fundamental concepts of data visualization.

In this course, you will define and examine the similarities and differences of exploratory and explanatory analysis, as well as begin to ask the right questions about what’s needed in visualization. You will assess how data and design work together, including how to choose the appropriate visual representation for your data, and the difference between effective and ineffective visuals. You will apply effective best practice design principles to your data visualizations, and be able to illustrate examples of strategic use of contrast to highlight important elements. You will evaluate pre-attentive attributes and why they are important in visualizations. You will examine the importance of using the “right” amount of color in the right place, and be able to apply design principles to de-clutter your data visualization.

c) Visual Analytics with Tableau

Created by University of California, Davis, in this third course of the specialization we’ll drill deeper into the tools Tableau offers in the areas of:

  • charting,
  • dates,
  • table calculations
  • and mapping

We’ll explore the best choices for charts, based on the type of data you are using. We’ll look at specific types of charts including scatter plots, Gantt charts, histograms, bullet charts and several others, and we’ll address charting guidelines. We’ll define discrete and continuous dates, and examine when to use each one to explain your data.  You’ll learn how to create custom and quick table calculations and how to create parameters. We’ll also introduce mapping and explore how Tableau can use different types of geographic data, how to connect to multiple data sources and how to create custom maps.

d) Creating Dashboards and Storytelling with Tableau

Leveraging the visualizations you created in the previous course, Visual Analytics with Tableau, you will create dashboards that help you identify the story within your data, and you will discover how to use Storypoints to create a powerful story to leave a lasting impression with your audience.

You will balance the goals of your stakeholders with the needs of your end-users, and be able to structure and organize your story for maximum impact. Throughout this course created by University of California, Davis, you will apply more advanced functions within Tableau, such as hierarchies, actions and parameters to guide user interactions. For your final project, you will create a compelling narrative to be delivered in a meeting, as a static report, or in an interactive display online.

e) Data Visualization with Tableau Project

In this project-based course, you will follow your own interests to create a portfolio-worthy single-frame viz or multi-frame data story that will be shared on Tableau Public. You will use all the skills taught in this Specialization to complete this project step-by-step, with guidance from your instructors along the way. You will first create a project proposal to identify your goals for the project, including the question you wish to answer or explore with data. You will then find data that will provide the information you are seeking. You will then import that data into Tableau and prepare it for analysis.

Next you will create a dashboard that will allow you to explore the data in depth and identify meaningful insights. You will then give structure to your data story by writing the story arc in narrative form. Finally, you will consult your design checklist to craft the final viz or data story in Tableau. This is your opportunity to show the world what you’re capable of – so think big, and have confidence in your skills!

 

#10 Data Mining Specialization

Created by University of Illinois at Urbana-Champaign, the Data Mining Specialization teaches data mining techniques for both unstructured data, which exists in the form of natural language text, and structured data, which conforms to a clearly defined schema. Specific course topics include:

  • pattern discovery,
  • clustering,
  • text retrieval,
  • text mining and analytics,
  • and data visualization

 

The courses that make up this Coursera Specialization are:

 

a) Data Visualization

Created by University of Illinois at Urbana-Champaign, this course will teach you how to take data that at first glance has little meaning and present it in a form that makes sense to people. You will learn how to visualize both numerical data and non-numerical data such as graphs and networks, and how human perception determines what makes a visualization effective.

b) Text Retrieval and Search Engines

This course created by University of Illinois at Urbana-Champaign will cover search engine technologies, which play an important role in many data mining applications.
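
At the heart of every search engine is the inverted index, which maps each term to the documents that contain it. A minimal sketch in Python (not the course's implementation; the documents are made up):

```python
from collections import defaultdict

def build_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in enumerate(docs):
        for term in text.lower().split():
            index[term].add(doc_id)
    return index

def search(index, query):
    """Boolean AND retrieval: documents containing every query term."""
    postings = [index.get(term, set()) for term in query.lower().split()]
    return set.intersection(*postings) if postings else set()

docs = ["data mining techniques",
        "mining text data",
        "search engines index text"]
index = build_index(docs)
print(sorted(search(index, "mining data")))  # [0, 1]
```

Production engines build on the same structure but add ranking (e.g. TF-IDF weighting), which the course covers in depth.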

c) Text Mining and Analytics

This course created by University of Illinois at Urbana-Champaign will cover the major techniques for analyzing and mining text data to extract useful knowledge, support decision making, and discover interesting patterns, with an emphasis on statistical approaches that can be generally applied to arbitrary text data in any natural language with minimal human effort.

d)  Pattern Discovery in Data Mining

Created by University of Illinois at Urbana-Champaign, in this course you’ll learn the general concepts of data mining along with basic methodologies and applications.

Then dive into one subfield in data mining: pattern discovery. Learn in-depth methods, applications and concepts of pattern discovery in data mining. We will also introduce some interesting applications of pattern discovery and methods for pattern-based classification. This course provides you with the opportunity to learn content and skills to engage and practice in scalable pattern discovery methods on massive transactional data, discuss pattern evaluation measures, and study methods for mining diverse kinds of:

  • frequent patterns,
  • sequential patterns,
  • and sub-graph patterns

 

e) Cluster Analysis in Data Mining

Created by University of Illinois at Urbana-Champaign, in this course you’ll discover the basic concepts of cluster analysis, and then study a set of typical clustering algorithms, applications, and methodologies. This includes hierarchical methods such as BIRCH, partitioning methods such as k-means, and density-based methods such as DBSCAN/OPTICS. Moreover, learn methods for clustering validation and evaluation of clustering quality. Finally, see examples of cluster analysis in applications.
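
Of the algorithms listed, k-means is the easiest to sketch. The one-dimensional toy version below alternates the two k-means steps, assigning points to the nearest centroid and recomputing each centroid as its cluster mean; real implementations (and the course's coverage) go well beyond this:

```python
import random

def kmeans(points, k, iters=100, seed=0):
    """Minimal 1-D k-means: alternate assignment and centroid update."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            # Assign each point to its nearest centroid.
            nearest = min(range(k), key=lambda i: abs(p - centroids[i]))
            clusters[nearest].append(p)
        # Recompute each centroid as the mean of its cluster.
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return sorted(centroids)

# Two well-separated groups of points, around 1.0 and 9.1.
points = [1.0, 1.2, 0.8, 9.0, 9.5, 8.8]
centroids = kmeans(points, k=2)
print(centroids)  # two centroids, near 1.0 and 9.1
```

Density-based methods such as DBSCAN take a different route entirely, growing clusters from dense neighborhoods instead of fixing the number of clusters up front.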

f) Data Mining Project

This six-week-long Project course of the Data Mining Specialization will allow you to apply the algorithms and techniques for data mining learned in the previous courses of the Specialization, including:

  • Pattern Discovery,
  • Clustering,
  • Text Retrieval,
  • Text Mining,
  • and Visualization

to solve interesting real-world data mining challenges. Specifically, you will work on a restaurant review data set from Yelp and use all the knowledge and skills you’ve learned from the previous courses to mine this data set for interesting and useful knowledge. The design of the Project emphasizes:

  • simulating the workflow of a data miner in a real job setting;
  • integrating different mining techniques covered in multiple individual courses;
  • experimenting with different ways to solve a problem to deepen your understanding of techniques;
  • allowing you to propose and explore your own ideas creatively.

 

#11 Python for Everybody Specialization

As you may already know, Python is one of the most widely used programming languages in the fields of data analysis, science and engineering. That is why I have added this Python specialization here.

Created by the University of Michigan, this Specialization introduces fundamental programming concepts including networked application program interfaces, data structures, and databases, using the Python programming language.

 

The courses that make up this Coursera Specialization are:

 

a)  Programming for Everybody (Getting Started with Python)

Created by University of Michigan, this course aims to teach everyone the basics of programming computers using Python. We cover the basics of how one constructs a program from a series of simple instructions in Python. This course will cover Chapters 1-5 of the textbook “Python for Everybody”.

b)   Python Data Structures

Created by University of Michigan, the core data structures of the Python programming language will be introduced in this course. We will explore how we can use the Python built-in data structures such as dictionaries, tuples, and lists to perform increasingly complex data analysis. This course will cover Chapters 6-10 of the textbook “Python for Everybody”.
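
As a taste of what the course covers, here is a small example combining a dictionary (for tallying) with a sorted list of tuples (for ranking); the data is made up:

```python
# Tally items with a dictionary, then rank them as a sorted list of tuples.
votes = ["python", "sql", "python", "r", "python", "sql"]

counts = {}
for lang in votes:
    counts[lang] = counts.get(lang, 0) + 1  # .get() avoids a KeyError on new keys

# Sort the (language, count) tuples by count, highest first.
ranking = sorted(counts.items(), key=lambda item: item[1], reverse=True)
print(ranking)  # [('python', 3), ('sql', 2), ('r', 1)]
```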

c)   Using Python to Access Web Data

Created by University of Michigan, this course will show how one can treat the Internet as a source of data. We will parse, scrape, and read web data as well as access data using web APIs. We will work with XML, HTML, and JSON data formats in Python. This course will cover Chapters 11-13 of the textbook “Python for Everybody”.
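
JSON is the format you will meet most often when calling web APIs. A minimal example with Python's built-in json module; the payload below is invented, and a real program would fetch it over the network with urllib instead of hard-coding it:

```python
import json

# A JSON payload of the kind a web API might return (contents are invented).
payload = """
{"user": "ada",
 "repos": [{"name": "notes", "stars": 7},
           {"name": "ml", "stars": 42}]}
"""

data = json.loads(payload)  # parse the JSON text into dicts and lists
total_stars = sum(repo["stars"] for repo in data["repos"])
print(data["user"], total_stars)  # ada 49
```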

d)   Using Databases with Python

This course will introduce students to the basics of SQL, as well as basic database design for storing data as part of a multi-step data gathering, analysis, and processing effort. We will also build web crawlers and multi-step data gathering and visualization processes. We will use the D3.js library to do basic data visualization. This course will cover Chapters 14-15 of the book “Python for Everybody”.
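
The storage pattern at the heart of this material, counting occurrences into an SQLite table from Python, can be sketched as follows (the email addresses are invented for illustration):

```python
import sqlite3

# Count occurrences into an SQLite table; in-memory here, a file in practice.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE counts (email TEXT, count INTEGER)")

for email in ["ada@example.com", "alan@example.com", "ada@example.com"]:
    cur.execute("SELECT count FROM counts WHERE email = ?", (email,))
    if cur.fetchone() is None:
        cur.execute("INSERT INTO counts (email, count) VALUES (?, 1)", (email,))
    else:
        cur.execute("UPDATE counts SET count = count + 1 WHERE email = ?",
                    (email,))
conn.commit()

rows = cur.execute("SELECT * FROM counts ORDER BY count DESC").fetchall()
print(rows)  # [('ada@example.com', 2), ('alan@example.com', 1)]
```

Note the `?` placeholders: parameterized queries are the idiomatic (and injection-safe) way to pass values into SQL from Python.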

e)    Capstone: Retrieving, Processing, and Visualizing Data with Python

Students will build a series of applications to retrieve, process, and visualize data using Python. In the first part, to become familiar with the technologies in use, students will do some visualization and then pursue their own project to visualize some other data that they have or can find. Chapters 15 and 16 from the book “Python for Everybody” will serve as the backbone for the capstone.

 

Online Degrees Related to Data Science

 

Master of Computer Science

A graduate degree credential from the University of Illinois, the Master of Computer Science builds knowledge and skills in advanced topics of computer science.

 

Master of Computer Science

A degree program from Arizona State University, the Master of Computer Science provides high-quality computer science instruction along with real-world experience through applied projects. You’ll gain a deep understanding of cutting-edge topics like:

  • AI,
  • cybersecurity,
  • blockchain,
  • and big data

while you develop interpersonal skills that’ll help you succeed in any organization.

Master of Applied Data Science

Gain insight into data and develop hands-on skills in:

  • programming,
  • statistics,
  • data analysis,
  • information visualization,
  • and machine learning

The Master of Applied Data Science degree program from the University of Michigan provides project-based education for students from a wide range of backgrounds, including the social sciences, professional schools and the sciences.

Master of Computer Science in Data Science

A degree program from University of Illinois, the Master of Computer Science in Data Science will give you the statistical and computational knowledge needed to turn big data into meaningful insights. Build expertise in four core areas of computer science:

  • data visualization,
  • machine learning,
  • data mining,
  • and cloud computing

while learning key skills in information science and statistics.

 

 
