Data Wrangling, Analysis and AB Testing with SQL
UC Davis offered this class through Coursera. The course description reads:
This course allows you to apply the SQL skills taught in “SQL for Data Science” to four increasingly complex and authentic data science inquiry case studies. We’ll learn how to convert timestamps of all types to common formats and perform date/time calculations. We’ll select and perform the optimal JOIN for a data science inquiry and clean data within an analysis dataset by deduping, running quality checks, backfilling, and handling nulls.
We’ll learn how to segment and analyze data per segment using windowing functions and use case statements to execute conditional logic to address a data science inquiry. We’ll also describe how to convert a query into a scheduled job and how to insert data into a date partition. Finally, given a predictive analysis need, we’ll engineer a feature from raw data using the tools and skills we’ve built over the course. The real-world application of these skills will give you the framework for performing the analysis of an AB test.
https://www.coursera.org/learn/data-wrangling-analysis-abtesting/home/info
The class definitely promises to cover some key concepts in data analytics – most data analytics job openings list AB testing as a critical skill. Unfortunately, this might be the absolute worst course I have ever taken, whether in my actual bachelor’s and master’s coursework or in MOOCs. It’s THAT bad. Read on for more info…
Upsides:
- As noted above, you need to learn how to do AB testing if you want to be a good data analyst. Hell, it’s important for being even a marginal business analyst. The gist is that you build two alternative versions of something, test them against each other, and keep whichever performs better. Think of it as hypothesis testing. Datatron has a solid description of the concept here.
- Likewise, data wrangling is a key skill in making data usable for analysis purposes.
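To make the hypothesis-testing framing a bit more concrete, here’s a minimal sketch of one common way to compare two variants: a two-proportion z-test on conversion counts. This is my own illustration in Python, not something taken from the course (which works in SQL). The function name and the counts are made up, and a real analysis would also need to think about sample size and how users were assigned to each variant.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for H0: A and B convert at the same rate.

    Hypothetical helper for illustration; not from the course material.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled conversion rate under the null hypothesis that A and B are identical.
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    # Standard error of the difference in proportions under the null.
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Made-up counts: 5,000 users per variant, 200 vs. 250 conversions.
z, p = two_proportion_z_test(conv_a=200, n_a=5000, conv_b=250, n_b=5000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

A small p-value suggests the difference between the two versions is unlikely to be chance alone. In practice, the per-variant counts fed into a test like this would come out of the kind of SQL wrangling the course is supposed to teach.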
Downsides:
Jeez, where to begin…
- You get quite literally no sense of what exactly you’re doing. Unlike the previous course in the specialization, you never really get an idea of what you’re supposed to accomplish. Don’t expect any direction whatsoever. You can try the Coursera forums for help, but most other users are just as lost about what to do next.
- The homework is completely obtuse, and you will find yourself lying, cheating and stealing just to get this course over with. Seriously, I recommend that you just find someone else to review your assignments and do likewise for the other person.
- The videos aren’t edited. The presenter fumbles around in real time. Yes, I get that there’s some value in seeing how a real person’s typos can mess up code, but there’s no value in watching silly typographical errors drastically extend the length of these videos.
- This course is paired with some other important ones in this specialization and completely drags the whole thing down. I credit the first course with making me very comfortable with SQL and the third course with teaching me to work with SQL in Databricks notebooks. I was committed to finishing this specialization and getting a certificate. If I hadn’t been, this class would have made me drop the program.
Such a shame that this course is as bad as it is…
If you’re committed to completing the Learn SQL Basics for Data Science specialization, just prepare for a bad couple of weeks in this class. Push through and try to make it as painless as possible.
If you’re not in love with this UC Davis certificate, though, run for the hills. Please find something else.