Our Python journey now takes us into Pandas DataFrames, with a native syntax very unlike SQL, especially as queries become more analytically complex. We will answer the following question, based on an included public list of employees and their jobs. From a list where one row indicates one employee, how many employee job titles in… Read More
Python’s NumPy library is fun in that it makes multi-dimensional data easy to work with. For simplicity, consider a 2D array (aka matrix). I wrote some code to demonstrate the creation, simple visualization, slicing, and aggregation of data within a matrix, including totals and slice-subtotals. Source Code: It is available on GitHub: NumPy 2D Array… Read More
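For a rough idea of what creating, slicing, and aggregating a matrix looks like, here is a minimal sketch. It is not the code from the GitHub post; the 3x4 shape, the values, and the variable names are assumptions chosen only for illustration.

```python
import numpy as np

# Hypothetical 3x4 matrix holding the values 1..12 (illustrative data only).
matrix = np.arange(1, 13).reshape(3, 4)

# Simple visualization: printing shows the rows and columns.
print(matrix)

# Totals: the grand total, then one total per row and per column.
grand_total = matrix.sum()
row_totals = matrix.sum(axis=1)   # collapse the columns, one value per row
col_totals = matrix.sum(axis=0)   # collapse the rows, one value per column

# Slice-subtotal: first two rows, middle two columns.
block = matrix[:2, 1:3]
block_subtotal = block.sum()

print(grand_total, row_totals, col_totals, block_subtotal)
```

The `axis` argument names the dimension that gets collapsed, which is why `axis=1` produces row totals rather than column totals.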
Although I don’t know whether OOP will be central to our exploration of NumPy, Pandas and other Python libraries for analytics, here is a simple example of what I find useful. I want to be able to perform any one of a set of related trigonometry expressions, and do so repeatedly without re-specifying the… Read More
Quick little geek-out here: Had some initial fun with Python string manipulations in order to detect a palindrome, defined here as a word or phrase (perhaps a very long phrase) spelled the same when reversed as when forward. Had to dig just a bit deeper to accommodate any blank spaces that would otherwise violate the… Read More
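One way such a check might look is sketched below. The function name `is_palindrome` is hypothetical, and ignoring letter case is my own assumption; the excerpt above only mentions handling blank spaces.

```python
def is_palindrome(text: str) -> bool:
    """Return True if text reads the same reversed, ignoring whitespace.

    Case-insensitivity is an assumption added for this sketch.
    """
    # Drop blank spaces (and other whitespace) and normalize case.
    cleaned = "".join(ch.lower() for ch in text if not ch.isspace())
    # A reversed slice gives the string spelled backward.
    return cleaned == cleaned[::-1]


print(is_palindrome("nurses run"))  # True
print(is_palindrome("python"))      # False
```

Stripping the spaces before comparing is what lets a multi-word phrase pass, since the spaces rarely land in mirrored positions.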
Data Models are a’changin! To learn about these changes, please join me Saturday, Oct 15, as I present “Lean Data Model Storming for Data Project Leaders” at the Southland Technology (SoTec) Conference 2016. To view my session abstract, click here. This premier event, underwritten by PMI, AITP, IIBA and QAI, will bring together hundreds of… Read More
I’ve just added a new entry to ‘Challenge – Solution – Impact’ based on a recent engagement. Click on the above title to have a look.
Without support from I.T., analysts increasingly need to perform data preparation tasks of varying complexity in order to wrangle data into shape for current analytic needs. Using Alteryx Designer, many such tasks are simple and intuitive. Let’s consider an example. For the completed Alteryx workflow sample published in the Alteryx product documentation, assume that, due to… Read More
Greetings! Click here to read my Amazon review of Ralph Hughes’ book on Agile DW for leaders. Well written and timely, it offers innovative, not widely known methods for expediting the delivery of priority insights and for keeping your data infrastructure lean and responsive to evolving business needs. Daniel
Here is another published slide deck, presented in Huntington Beach at the recent SQL Saturday event. To view the deck, click here. Viewer comments are always welcome!
This post for data modelers, like a good portion of my online content, is in the form of published slides. To view them, click here. If you like them, feel free to subscribe here for the occasional new entry.