Purchasing strategies are a significant contributing factor to the overall performance of a firm. Not only has it been argued that transitioning purchasing from a tactical to a strategic role provides sustainable competitive advantage, but compelling evidence also supports the relationship between strategic purchasing and supplier integration, performance, power, and total cost of ownership, among other benefits. Consequently, it has become widely accepted that establishing a strategic purchasing approach is imperative for today’s firm.
Back in March I wrote a post about writing a book on R. I was fortunate enough to have this book picked up by Springer for the Use R! series. Data Wrangling with R was just published and is now available to purchase through Springer or on Amazon. This book will guide you through the data wrangling process and give you a solid foundation in working with data in R. For more details on what this book covers see the description here.
Learning curves are steeped in history and go by several alternate names, such as improvement curves, progress curves, startup functions, and efficiency curves. The “learning effect” was first noted in the 1920s in connection with aircraft production, and its use was amplified by experience with aircraft production in WWII. Initially, it was thought to be solely due to the learning of the workers as they repeated their tasks. Later, it was observed that other factors probably contributed as well, such as improved tools, working conditions, and various management initiatives. Regardless of the exact, or more likely combined, phenomenon, we can group these factors together under the general heading of “learning.”
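To make the idea concrete, below is a minimal sketch of the classic log-linear (Wright) unit learning curve model; the function name, the 80% learning rate, and the 100-hour first unit are illustrative assumptions, not values from any particular study.

```r
# Classic log-linear (Wright) learning curve: y = a * x^b,
# where a = cost (or hours) of the first unit, x = cumulative unit number,
# b = log(r) / log(2), and r = the learning rate (e.g., 0.80 for an 80% curve)
unit_cost <- function(x, a, r) {
  b <- log(r) / log(2)
  a * x^b
}

# Illustrative values: first unit takes 100 hours under an 80% learning curve;
# each doubling of cumulative output drops the unit time to 80% of its prior value
unit_cost(x = c(1, 2, 4, 8), a = 100, r = 0.80)
#> [1] 100.0  80.0  64.0  51.2
```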
Per capita income and expenditures provide crucial insight into the average standard of living in specified areas. Disposable per capita income measures the average income earned after taxes per person in a given area (city, state, country, etc.) in a specified year. It is calculated by dividing the area’s total after-tax income by its total population. Per capita expenditures, on the other hand, measure the average outlay for goods and services per person and provide insight into spending patterns across a given area. Together, assessing per capita income against expenditures can provide a better understanding of regional economies, differences in standard of living, and approximate savings rates.
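As a simple illustration of these calculations in R, here is a minimal sketch using a made-up data frame of regional totals (all names and values are hypothetical):

```r
# Toy regional totals; the figures are illustrative only
region_data <- data.frame(
  region            = c("Region A", "Region B"),
  total_income_at   = c(5.2e10, 8.7e10),  # total after-tax income ($)
  total_expenditure = c(4.6e10, 7.9e10),  # total outlays on goods and services ($)
  population        = c(1.3e6, 2.1e6)
)

# Per capita measures: divide each regional total by its population
region_data$pc_income      <- region_data$total_income_at / region_data$population
region_data$pc_expenditure <- region_data$total_expenditure / region_data$population

# An approximate savings rate: the share of after-tax income that is not spent
region_data$savings_rate <- 1 - region_data$pc_expenditure / region_data$pc_income
region_data
```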
Correlation provides a good (initial) indication of association; however, people often throw correlation values around without considering their significance. Although there is debate over what correlation values constitute strong, moderate, and weak relationships, we should also be aware that sample size influences whether a correlation is statistically significant or not. Just this week at work, I had a conversation that highlighted this.
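A quick simulation illustrates the point: the same modest correlation can fail to reach significance in a small sample yet be highly significant in a large one. The helper function and sample sizes below are just for illustration.

```r
set.seed(123)

# Simulate two variables with a true correlation of about 0.3,
# then test the observed correlation for significance
simulate_cor <- function(n, rho = 0.3) {
  x <- rnorm(n)
  y <- rho * x + sqrt(1 - rho^2) * rnorm(n)
  cor.test(x, y)
}

simulate_cor(n = 20)$p.value    # with 20 observations, often not significant
simulate_cor(n = 2000)$p.value  # with 2,000 observations, essentially zero
```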
Last week the Bureau of Labor Statistics (BLS) released the most recent unemployment statistics for April. No surprises in the unemployment rate were experienced, with the rate holding steady at about 5% over the past eight months and down from 5.4% in April of 2015. Each month the BLS reports these figures and each month the media spends a fair amount of time debating the relevance of the changes experienced and what these changes may indicate about our economy. It’s amazing how such a “simple” statistic can cause so much national debate and can be a central input used by decision-makers to guide future policy decisions. However, coming up with this statistic is far from “simple”.1 Moreover, few Americans understand just what makes up the components of this macro-economic indicator. This post, hopefully, will help shed some light on this latter issue.
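For orientation before digging into those components, the headline (U-3) rate itself is simply the number of unemployed persons divided by the civilian labor force. A minimal sketch with placeholder figures (not actual BLS values):

```r
# Headline (U-3) unemployment rate = unemployed / civilian labor force,
# where the labor force is the employed plus the unemployed.
# The counts below are illustrative placeholders, not actual BLS figures.
employed   <- 151e6
unemployed <- 7.9e6

labor_force       <- employed + unemployed
unemployment_rate <- unemployed / labor_force * 100
round(unemployment_rate, 1)
#> [1] 5
```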
Americans believe “in the green light, the orgastic future that year by year recedes before us. It eludes us then, but that’s no matter — tomorrow we will run faster, stretch out our arms farther…So we beat on, boats against the current, borne back ceaselessly into the past.”1 Like Gatsby, Americans never lose their optimism for an ever-brighter future; and our vehicle of choice to get to this destination is technology. But should we expect technology to move us to a future that transcends our past?
"We wanted flying cars, instead we got 140 characters."
Whether in industry or government agencies, indirect/support activities tend to get the short end of the stick with regard to analytic rigor. Much of my dissertation research focused on injecting more analytic rigor to better understand the economics of, and policy impacts to, these activities. This was one of my papers, which assessed the potential application of Bayesian networks to provide decision support.
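To illustrate the general approach (not the paper’s actual model), here is a minimal sketch with the bnlearn package: learn a network structure from data, fit its conditional probabilities, and query the fitted network for decision support. It uses bnlearn’s built-in learning.test example data; a real analysis would substitute the support-activity data of interest.

```r
library(bnlearn)

# learning.test ships with bnlearn; it stands in for real support-activity data
dag    <- hc(learning.test)           # learn the network structure via hill-climbing
fitted <- bn.fit(dag, learning.test)  # estimate the conditional probability tables

# Query the network: probability of one outcome given evidence on another node
cpquery(fitted, event = (C == "c"), evidence = (A == "a"))
```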
So I decided to write a book. Data Wrangling with R will help you learn the essentials of preprocessing data, leveraging the R programming language to easily and quickly turn noisy data into usable pieces of information. Data wrangling, which is also commonly referred to as data munging, transformation, manipulation, janitor work, etc., can be a painstakingly laborious process. In fact, it’s been stated that up to 80% of data analysis is spent on the process of cleaning and preparing data. However, because it is a prerequisite to the rest of the data analysis workflow (visualization, analysis, reporting), it’s essential that you become fluent and efficient in data wrangling techniques.
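For a small taste of what that looks like in practice, here is a minimal sketch using dplyr and tidyr to reshape and summarize a tiny, made-up data set (all names and values are illustrative, and this is not an excerpt from the book):

```r
library(dplyr)
library(tidyr)

# A small, messy, made-up data set: one column per year
messy <- data.frame(
  store  = c("A", "B"),
  `2014` = c(120, 95),
  `2015` = c(134, 101),
  check.names = FALSE
)

# Reshape to a tidy long format, then summarize sales by year
messy %>%
  pivot_longer(-store, names_to = "year", values_to = "sales") %>%
  group_by(year) %>%
  summarize(total_sales = sum(sales))
```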
A recent opinion article in ORMS Today asked if open-source statistical software is really free. The authors’ employment of “truth” and their sketch of hidden costs suggest their answer is no. In a sense, I agree. Economic thinking acknowledges there is no free lunch. However, when it comes to the relative risks and merits of software, much remains to be discussed. Despite their titular assertion, the authors did not present “the true cost of ‘free’ statistical software.” Rather, they provided an alternative conceptualization of costs, particularly those of the R programming software. In response, a co-author and I presented a critique of that conceptualization and rebutted their seven main claims. Taking a post-modern turn, we leave any notions of truth to the consumer.
In the epic poem The Rime of the Ancient Mariner, Samuel Taylor Coleridge writes, “Water, water, everywhere, nor any drop to drink.” Indeed, some would say the same about data. Data appear to be everywhere, yet only a fraction are analyzed. There are several arguments as to why, but one that has reached the concern of the White House is data accessibility. However, this is rapidly changing as growth in technology and resources quickly opens the doors of many data vaults to the masses. We, the public minions, now have access to a wide range of data: from social, financial, government, and ecommerce data to geospatial, search engine, and even ant data. We just need to know how to get it. Enter APIs.
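The general pattern in R looks something like the sketch below, using the httr and jsonlite packages: request an endpoint, check the status, and parse the JSON response. GitHub’s public API is used purely as an example of a freely accessible endpoint.

```r
library(httr)
library(jsonlite)

# Request a public, no-authentication endpoint (GitHub's API as an example)
resp <- GET("https://api.github.com/users/hadley/repos")
stop_for_status(resp)  # stop with an error if the request failed

# Parse the JSON body into an R data frame
repos <- fromJSON(content(resp, as = "text", encoding = "UTF-8"))
head(repos[, c("name", "stargazers_count")])
```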
With my previous two blog posts I implicitly started a series that covers common web scraping capabilities offered by R. In my first post I covered how to import tabular (i.e. .txt, .csv) or Excel files that are hosted online, and in my last post I covered text scraping. In this post I cover how to scrape data from another common data storage structure on the Web: HTML tables.
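As a preview, the core of that workflow with the rvest package fits in a few lines; the Wikipedia URL below is just an illustrative page that happens to contain HTML tables.

```r
library(rvest)

# Read the page, then convert its <table> nodes into data frames
url    <- "https://en.wikipedia.org/wiki/List_of_countries_by_population_(United_Nations)"
page   <- read_html(url)
tables <- html_table(page)  # returns a list with one element per table on the page

str(tables[[1]])  # inspect the first table
```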
A vast amount of information exists across the countless webpages online. Much of this information is considered “unstructured” text since it doesn’t come in a neatly packaged spreadsheet. Fortunately, HTML websites are organized documents, which means this text is actually structured within underlying HTML code elements…we just need to figure out how to extract it! This post covers the basics of scraping text from online sources.
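The basic pattern with rvest is sketched below: read a page, select the elements holding the text with a CSS selector (“p” grabs paragraph tags), and extract their contents. The URL is purely illustrative.

```r
library(rvest)

# Read a page and pull the text out of its paragraph (<p>) elements
page <- read_html("https://en.wikipedia.org/wiki/Web_scraping")
paragraphs <- page %>%
  html_nodes("p") %>%       # select elements by CSS selector
  html_text(trim = TRUE)    # extract the text, trimming whitespace

head(paragraphs, 3)
```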
Last Tuesday, October 14, Walmart’s stock price dropped by 10.04% due to their dim outlook on next year’s predicted profits. This drop was, reportedly, one of the biggest single-day declines in Walmart’s history, leaving the closing adjusted stock price at $60.03; the lowest it’s been since May 23, 2012.