In the analytics department battle, I am pro-centralized
Part 1: Weaknesses in a decentralized system
When I initially structured my agency analytics department in 2017, I had no choice but to make it decentralized. I was given broad access to the state’s Cloudera Hadoop platform, modern ingestion tools, and several petabytes of space. I had no staff at the time, yet there was a pressing need to quickly ramp up operations. I am also not a full stack developer. My back-end skills at the time were those of a true circa-2009 analyst — writing SQL, pulling from source systems to local tables, and then using BI software to create visuals, reports, and analyses.
Commence scavenger mode. I pulled in some DBAs from the IT department, a couple key SMEs, technical staff from our major vendor, and the developers of the state BI platform. From there, it was a matter of drawing up schematics and hoping that departments could spare the time to approve connection requests, blow through firewalls, and help build the ingestion segments.
There was also plenty of begging involved: for time, for precious resources, and to get into the development pipeline — period. Once the jobs were done, it was a painfully slow matter of making change requests as they were needed. With an on-site centralized data engineer, there would have been much more responsiveness, but you fight with the army you have.
From there, I acquired and distributed nine Tableau licenses throughout the agency. Truthfully, maybe three or four of those licenses were in good hands. The rest were handed out in a manner that kept departments happy. In a centralized model, the analytics department staff would have gotten the licenses, and they would have been used optimally.
In my attempts to be fiscally prudent, the licenses were issued with a warning that I’d be monitoring software usage levels. Employees who didn’t use their software would lose it, and the seat would be reassigned. I followed through on this threat by requesting periodic IT reports of users’ Tableau software versions. Users running a version more than a year old would get a nudge. Users who continued to avoid an update would get their seat pulled and reassigned. About a year in, I finally had a mostly solid set of staff members with licenses.
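The nudge-then-reassign policy can be sketched as a simple audit pass. To be clear, this is a hypothetical illustration: the report format, field names, and one-year threshold here are assumptions standing in for whatever the IT report actually contained.

```python
from datetime import date, timedelta

# Hypothetical policy threshold: versions older than a year trigger action.
STALE_AFTER = timedelta(days=365)

def audit(report, today, prior_nudges):
    """report: list of (user, last_update_date) tuples from a
    hypothetical IT export. prior_nudges: users warned in an
    earlier audit. Returns (users to nudge, seats to reassign)."""
    nudge, reassign = [], []
    for user, last_update in report:
        if today - last_update <= STALE_AFTER:
            continue  # version is recent enough; no action needed
        if user in prior_nudges:
            reassign.append(user)  # ignored the earlier nudge
        else:
            nudge.append(user)  # first offense: send a reminder
    return nudge, reassign

# Example pass over a made-up report:
report = [
    ("ana", date(2018, 5, 1)),   # updated recently
    ("ben", date(2016, 3, 1)),   # stale, already nudged once
    ("cal", date(2016, 9, 9)),   # stale, first offense
]
nudge, reassign = audit(report, date(2018, 6, 1), prior_nudges={"ben"})
# nudge -> ["cal"], reassign -> ["ben"]
```

The two-strike structure (warn first, reassign on continued neglect) is what made the policy enforceable without feeling punitive.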
But this was the worst part of the decentralized model: I was a trained analyst, not an engineer. Table transformations were farmed out to parties outside the organization, and the final vision of curated data sets was never achieved. Decentralization meant playing office politics, which is just redistributing data misery.
Ultimately, I see decentralized analytics functions as Step 1 in analytics maturity.
There are so many practical downsides to the decentralized model:
- So much harder to hire critical staff
- Nearly impossible to enforce minimum standards
- A Tableau license alone does not make a marketing specialist a data analyst
Rather, for the purposes of decisively quashing office politics and establishing and executing a singular vision, it’s better to centralize. I’ll cover that topic in a future post.