Seasonality and The Newspaper Distribution Problem: Using Data Visualization to Improve Trend Line Forecasts

Ana Maria Garcia, Guen Pak, Francis Oduro, Deondre Thompson, Karla Erazo, Joseph Gilkey Jr.

Abstract—Given the persistently poor and uncertain economic performance of the print media industry, newspaper distributors are challenged to visualize and leverage data from their distribution routes and retail store points to identify the routes, and the distribution points along those routes, that hold the potential for profitability. This paper analyzes the geographic overview and corresponding data from the prime distribution areas of a newspaper distribution company operating in the major urban corridors of southeastern Pennsylvania and north-central New Jersey. Trend line forecasts are then generated to predict sales performance in each area for specific newspaper products. A waterfall model further pinpoints the types of newspaper products and the best distribution points in the company’s areas of responsibility.

Index Terms—data visualization, newspaper distribution, sales trend line forecasting, seasonality

I. INTRODUCTION

Liberty News Distributors, Inc., founded in 2006, serves more than 5,000 national accounts and distributes more than 1,500 titles, including domestic newspapers, periodicals and international magazines. Major distribution points include convenience stores, shopping plazas and airports, along with more than 300 select independent and chain store accounts served via FedEx Corp.

Recognizing the need to identify sales opportunities and maintain ongoing positive communication with retailers, distributors and publishers, the company tasked a research team of university students, led by a faculty member, with using data visualization tools (e.g., Tableau and Microsoft Excel) to fine-tune daily distribution operations so that they reflect consumer buying behavior for newspaper products and the distribution points at the stores consumers patronize.

This study focuses on distribution of four newspapers in the Pennsylvania-New Jersey market covered by the company: Delco Times, The New York Times, New York Daily Post and New York Daily News.

Manuscript received January 10, 2017. This work was part of a business administration course for Saint Peter’s University, led by Joseph Gilkey, Ph.D., and in collaboration with Liberty News Distribution, Inc. Ana Maria Garcia, Guen Pak, Francis Oduro, Deondre Thompson and Karla Erazo, authors, are students at Saint Peter’s University. All queries should be directed to [email protected] or [email protected].

II. LITERATURE REVIEW

A. Distribution and Seasonality

Newspapers have a short, time-sensitive life. For national dailies, such as The New York Times and The Daily News, the value of each copy falls to zero the day after its publication. The lifecycle is rapid, as most readers prefer to receive the news before 9 a.m. or whatever time their workday begins, unlike longer-form media products (e.g., novels, hardbacks, magazines or other periodicals). These problems have been compounded by large-scale changes in the commuting habits of consumers, who must contend with heavy traffic volume, especially during the morning hours in major urban areas.

The challenge existed long before technological advancements in information dissemination and communication began to affect print media’s profitability, as Fowler’s model of the comparative readability of newspapers and novels has demonstrated for 1904, 1933 and 1965 (Fowler, 1978). The newspaper distribution problem also has been compounded by the geography of distribution and allocation points, along with fleet routing problems. In this case, Liberty News Distribution has experienced routing problems because the geographical distance of some locations confounds the economies of scale in distribution. In a 1996 study of a major U.S. metropolitan newspaper, cost savings were realized by reducing the number of distribution centers, along with a corresponding decrease in the truck fleets and drivers required to serve the distribution centers from the newspaper production facilities (Hurter & Van Buer, 1996).

Resolving the distribution problem requires coping with uncertain demand in order to identify potentially profitable drop-off points for newspaper products in targeted areas that can be served in the shortest possible time. The objective is to reduce undue costs and mitigate the risk of lost sales, which is aggravated by high levels of returns, stocking costs and transportation expenses, especially on the least profitable segments of routes in the company’s areas of responsibility.

B. Readership

The decline in newspaper circulation and readership continues a long-term trend. In 2015, average weekday circulation fell 7 percent, despite an annual increase of 2 percent in digital subscriptions to newspapers. Sunday circulation during the same year declined by 4 percent, again despite a 4 percent increase in digital Sunday newspaper subscriptions (Barthel, 2016). The most recent declines occurred after a brief rebound in print subscriptions in 2013.

Despite the declines, print circulation still accounts for the largest share of readership (78 percent, weekdays; 86 percent, Sunday) and one survey indicates that 59 percent of consumers who read newspapers still do so in print-only formats (Barthel, 2016).

III. DATA


The company provided 12 months of aggregate data for the 2015-16 period so that the research team could prepare visualizations for analysis. The data include the distribution locations, with street addresses, for each route in all of the areas designated for analysis. Sales for each day of the week and for individual newspaper editions are given for each location, along with prices, revenues and gross profit margins.

The team accounted for aspects of the data that might have hindered effective visualization. One instance involved negative numbers signifying a potential revenue loss, which arose where “Draws” and “Returns” had not been properly reconciled to equal “Sales.” Such problems were resolved by segmenting the data (i.e., querying identical records of data) into measurable pieces and establishing a coordinated hierarchy covering the four geographical areas of operations. For example, a Philadelphia store sells papers every day of the week, but the circulation data separate weekdays, Saturdays and Sundays into distinct records (Fig. 1). To make the analysis more efficient and comprehensible, data records were reformatted to chart weekly sales rather than day-to-day sales (Fig. 2). A script was created to automate the process for the entire raw data set (https://github.com/gpak/SchoolprojectOR/blob/master/Combining%20Cells). This procedure made data visualization and storage easier and more efficient.

Fig. 1. The raw data as presented by the company.

Fig. 2. The condensed data after file manipulation.
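The condensing step illustrated in Fig. 1 and Fig. 2 can be sketched in a few lines of Python. This is not the authors' script (which is available at the GitHub link above); it is a minimal illustration under stated assumptions. The field names (location, title, week_ending, draw, returns, sales), the sample records and the rule that draws minus returns should equal sales are all hypothetical choices made for the example.

```python
from collections import defaultdict

# Hypothetical numeric fields; the company's real column names are not given in the paper.
FIELDS = ("draw", "returns", "sales")


def combine_weekly(records):
    """Sum weekday, Saturday and Sunday records into one row per
    (location, title, week_ending) key."""
    totals = defaultdict(lambda: dict.fromkeys(FIELDS, 0))
    for rec in records:
        key = (rec["location"], rec["title"], rec["week_ending"])
        for field in FIELDS:
            totals[key][field] += rec[field]
    return dict(totals)


def find_unreconciled(weekly):
    """Return keys whose totals break the assumed identity draw - returns == sales."""
    return [key for key, t in weekly.items()
            if t["draw"] - t["returns"] != t["sales"]]


# Made-up sample data: Store A is split across three records, Store B does not reconcile.
records = [
    {"location": "Store A", "title": "Delco Times", "week_ending": "2016-10-02",
     "draw": 50, "returns": 20, "sales": 30},   # weekdays
    {"location": "Store A", "title": "Delco Times", "week_ending": "2016-10-02",
     "draw": 10, "returns": 4, "sales": 6},     # Saturday
    {"location": "Store A", "title": "Delco Times", "week_ending": "2016-10-02",
     "draw": 15, "returns": 3, "sales": 12},    # Sunday
    {"location": "Store B", "title": "Delco Times", "week_ending": "2016-10-02",
     "draw": 10, "returns": 2, "sales": 9},     # 10 - 2 != 9
]

weekly = combine_weekly(records)
for key, totals in weekly.items():
    print(key, totals)
print("needs review:", find_unreconciled(weekly))
```

Keeping the reconciliation check separate from the aggregation means that problem records are flagged for review rather than silently dropped, which matches the team's decision to examine negative values instead of discarding them.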

The research team identified four areas to further organize the data set: Area 1 (greater Philadelphia and Delaware County, Pennsylvania), Area 2 (north and central New Jersey), Area 3 (central and coastal New Jersey) and Area 4 (newly acquired distribution outlets that lacked a sufficient span of data and were set aside temporarily for follow-up research).

IV. RESULTS

One example of a linear trend model (0.0878696 × Week of W/E + 4379.1) was computed for The New York Times in Area 1, with Measure Values given W/E Week (R² = 0.818676; F(14, 406) = 140.622; p < 0.001).
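For readers who want to see how a trend line of this form and its R² are obtained, the following Python sketch fits an ordinary least-squares line to a short series. It is an illustration only: the study's trend lines were produced in the data visualization tool, and the weekly figures below are made-up numbers, not the company's data.

```python
def fit_linear_trend(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept.
    Returns (slope, intercept, r_squared)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R^2 is the share of the variance in y explained by the fitted line.
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return slope, intercept, 1.0 - ss_res / ss_tot


# Hypothetical weekly sales totals indexed by week number (illustrative only).
weeks = [1, 2, 3, 4, 5, 6, 7, 8]
sales = [4380, 4395, 4388, 4410, 4402, 4421, 4430, 4426]

slope, intercept, r2 = fit_linear_trend(weeks, sales)
print(f"sales = {slope:.3f} * week + {intercept:.1f}  (R^2 = {r2:.3f})")
```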

In the data visualization tool, the user can choose any day of the week to highlight in the data (e.g., comparing each Sunday with the entire data set) to capture trends and seasonality. For example, the Sunday measures indicated solid sales in Area 1 (t = 15.3098, p < 0.001). Sunday sales at the company’s distribution points in Area 1 increased 55 percent from October 2015 to October 2016, primarily because of a corresponding increase in the number of distribution points the company was serving in the area.

Model forecasting relied on the three parameters of exponential smoothing (levels, trends, and seasonality) to identify best-case scenarios (Table 1). Seasonality emerged as a strong predictor.

TABLE 1. MODEL FORECASTING WITH EXPONENTIAL SMOOTHING.

                   Quality Metrics                        Smoothing Coefficients
Measure            RMSE¹   MAE²   MASE³   MAPE⁴    AIC⁵   Alpha   Beta    Gamma
Sum of Mon-Sale    30      23     0.66    3.6%     436    0.291   0.000   0.089
Sum of Tue-Sale    34      25     0.80    4.1%     452    0.193   0.000   0.063
Sum of Wed-Sale    30      24     0.76    3.8%     438    0.005   0.052   0.003
Sum of Thu-Sale    33      27     0.79    4.2%     449    0.000   0.000   0.029
Sum of Fri-Sale    59      44     0.81    6.2%     517    0.040   0.469   0.000
Sum of Sat-Sale    39      27     1.31    4.5%     467    0.361   0.000   0.069
Sum of Sun-Sale    116     82     1.27    22.5%    597    0.234   0.000   0.159
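The three smoothing coefficients in Table 1 (alpha for level, beta for trend, gamma for seasonality) correspond to what is commonly called triple exponential smoothing, or the Holt-Winters method. The sketch below shows one way such a forecast can be computed. It assumes an additive seasonal component and a simple initialization from the first two seasons; the paper does not state which variant or initialization the visualization tool used, so this should be read as an illustration of the technique rather than a reproduction of the reported models. The input series is made up.

```python
def holt_winters_additive(series, season_length, alpha, beta, gamma, horizon):
    """Additive triple exponential smoothing (assumed variant).
    Returns `horizon` forecasts beyond the end of `series`."""
    m = season_length
    if len(series) < 2 * m:
        raise ValueError("need at least two full seasons of data")

    # Simple initialization from the first two seasons (an assumption).
    first_mean = sum(series[:m]) / m
    second_mean = sum(series[m:2 * m]) / m
    level = first_mean
    trend = (second_mean - first_mean) / m
    seasonals = [series[i] - first_mean for i in range(m)]

    for t, y in enumerate(series):
        s_prev = seasonals[t % m]          # seasonal estimate from one season ago
        prev_level = level
        level = alpha * (y - s_prev) + (1 - alpha) * (prev_level + trend)
        trend = beta * (level - prev_level) + (1 - beta) * trend
        seasonals[t % m] = gamma * (y - level) + (1 - gamma) * s_prev

    n = len(series)
    return [level + h * trend + seasonals[(n + h - 1) % m]
            for h in range(1, horizon + 1)]


# Hypothetical sales series with a four-period season (illustrative only).
history = [620, 650, 700, 640, 630, 655, 710, 650, 640, 665, 720, 655]
forecast = holt_winters_additive(history, season_length=4,
                                 alpha=0.291, beta=0.0, gamma=0.089, horizon=4)
print([round(value, 1) for value in forecast])
```

The coefficient values in the usage example are borrowed from the Monday row of Table 1 purely as plausible inputs. A beta of zero, as in most rows of the table, leaves the trend estimate fixed at its initial value.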

An additional element of the data visualization was a set of geographic overviews of each of the three areas covered in newspaper distribution, showing the types of media product that would be most conducive to profitable sales in the respective areas. For instance, Area 1 signaled sales support for newspapers focusing on news about Delaware County, Pennsylvania, and the greater Philadelphia area, while sales of The New York Times editions were most evident in Area 2, and less so in Area 3.

1 Root-Mean-Square Error (RMSE): the square root of the average squared difference between the predicted and observed values.

2 Mean Absolute Error (MAE): the average absolute difference between the forecasts and the actual outcomes.

3 Mean Absolute Scaled Error (MASE): the forecast’s mean absolute error divided by the mean absolute error of a naive forecast; values below 1 indicate that the forecast outperforms the naive benchmark.

4 Mean Absolute Percentage Error (MAPE): the average absolute forecast error expressed as a percentage of the actual values.

5 Akaike Information Criterion (AIC): a measure of the relative quality of a statistical model that balances goodness of fit against model complexity.
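The four error measures defined in the notes above can be computed directly from a list of actual values and a list of forecasts. The following Python sketch uses their standard textbook definitions; the sample numbers are hypothetical, and the choice of a one-step naive forecast as the MASE benchmark is an assumption, since the paper does not say which benchmark the tool applied.

```python
import math


def rmse(actual, forecast):
    """Root-mean-square error."""
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))


def mae(actual, forecast):
    """Mean absolute error."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def mape(actual, forecast):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    return 100.0 * sum(abs(a - f) / abs(a) for a, f in zip(actual, forecast)) / len(actual)


def mase(actual, forecast, training, season_length=1):
    """MAE scaled by the in-sample MAE of a naive forecast that repeats the
    value observed `season_length` periods earlier (assumed benchmark)."""
    m = season_length
    naive_errors = [abs(training[t] - training[t - m]) for t in range(m, len(training))]
    return mae(actual, forecast) / (sum(naive_errors) / len(naive_errors))


# Hypothetical values for illustration only.
training = [600, 640, 610, 655, 630]
actual = [650, 620, 660]
forecast = [640, 635, 650]

print(f"RMSE={rmse(actual, forecast):.1f}  MAE={mae(actual, forecast):.1f}  "
      f"MAPE={mape(actual, forecast):.1f}%  MASE={mase(actual, forecast, training):.2f}")
```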


Fig. 3. Area 1: geographic overview.

Fig. 4. Area 2: geographic overview.

Fig. 5. Area 3: geographic overview.

To handle negative numbers in the original data set, the researchers segregated those records into a separate set in order to highlight poor-performing distribution locations that might need further attention. The analysis generated a waterfall model to track losses over time. For example, one convenience store location received 10 copies of a particular newspaper daily. The store, on average, sold six copies per day and returned the remainder to the company for a refund disbursed at the end of each week, which would offset revenues over time. The model highlighted this location as a candidate for revising distribution. In the model, Areas 1 and 3 appeared to have the highest number of locations that would have a persistently negative impact on revenue, whether because of poor sales or because of client business owners who might try to exploit the refund transactions. As observed in the waterfall figures (Fig. 6, Fig. 7 and Fig. 8), the problem of loss-prone distribution locations is not as prominent in Area 1 as it is in Area 3. The waterfall model output for Area 1 indicates a slight decline and a stagnating position, but in Area 3 there is a decline across all of the region’s zip codes. This sheds light on significant underlying revenue losses that likely had not been tracked as closely as they should have been. The model output for Area 2 suggests the area is a solid revenue performer, and the results suggest that the company should address its most serious distribution problems first in Area 3.

Fig. 6. Area 1: waterfall model.

Fig. 7. Area 2: waterfall model.

Fig. 8. Area 3: waterfall model.
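The convenience store example above (10 copies delivered daily, six sold on average) can be turned into the running totals that a waterfall chart displays. The Python sketch below is a minimal illustration of that arithmetic, not the model used in the study. The refund of 1.00 per returned copy, the seven delivery days per week and the four-week window are hypothetical assumptions introduced for the example.

```python
def waterfall_steps(changes, start=0.0):
    """Turn a list of (label, delta) pairs into (label, begin, end) bars,
    where each bar starts at the running total left by the previous one."""
    steps = []
    running = start
    for label, delta in changes:
        steps.append((label, running, running + delta))
        running += delta
    return steps


DAILY_DRAW = 10          # copies delivered per day (from the example in the text)
DAILY_SALES = 6          # average copies sold per day (from the example in the text)
DAYS_PER_WEEK = 7        # assumption: the store is supplied every day
REFUND_PER_COPY = 1.00   # hypothetical refund per returned copy

weekly_refund = (DAILY_DRAW - DAILY_SALES) * DAYS_PER_WEEK * REFUND_PER_COPY
changes = [(f"Week {week}", -weekly_refund) for week in range(1, 5)]

steps = waterfall_steps(changes)
for label, begin, end in steps:
    print(f"{label}: {begin:.2f} -> {end:.2f}")
```

Under these assumptions the location returns 28 copies a week, so the cumulative refund grows by the same amount each week; a steadily descending staircase of this kind is the pattern that marks a location as a candidate for a reduced draw.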

Researchers also generated histograms for each area and for each newspaper publication to highlight specific distribution locations (such as stores of two popular convenience chains that are open 24/7) where the potential for selling more copies of specific newspapers would be the most promising (e.g., those that consistently showed no copies being returned for refunds). For example, for an Area 1 convenience store that is always open, the model predicted the location could sell more copies than it currently sells. The data also prompt the company’s distribution management to learn more about the factors behind specific locations that tend to perform well consistently.

Fig. 9. Day-specific analysis of one newspaper’s sales for chains of convenience stores.

For example, many of the best-performing locations in Area 1 are based in neighborhoods that share demographics with segments identified as loyal print media consumers (e.g., age groups that spent their formative years with media before the age of the Internet and digital media). In addition, locations that are near concentrations of schools and in lower- and middle-income areas also are ideal candidates for improved sales. Media access via traditional formats still matters, especially in lower-income families where children might need to rely on media for school assignments and might not always have the opportunity to visit a local library branch.

In addition, data visualization can pinpoint dates when sales increased significantly, as consumers searched for information about major breaking news and events. While no one can anticipate when a major story might break, data visualization tools can augment a distributor’s capacity to respond quickly to sudden surges in demand for newspaper editions that focus on such breaking news. Consumers might patronize certain locations in such instances, knowing that a particular convenience store, for example, will have sufficient numbers of newspaper copies for purchase. Researchers also found that, on numerous occasions, many locations returned only one copy of a particular newspaper during the week. Some anecdotal evidence indicated that the sole remaining copy was held back as a courtesy so that customers at least had the opportunity to scan the newspaper if no other copies remained for sale. Such a location also might be an ideal candidate for distributing additional copies for sale.

V. CONCLUSION

The challenges for a company involved in distributing print media products are immense, as the newspaper industry continues to consolidate and shrink. However, survey data indicate that legacy print media such as newspapers remain an important consumer product. National readership data culled from Nielsen Scarborough’s 2015 Newspaper Penetration Report show that 51 percent of those who consume a newspaper read it exclusively in print, down from 62 percent print-only readership in 2011 and 59 percent in 2012. Print still is a predominant choice for many consumers, but the inevitable decline will continue. For newspaper distribution companies, such as the one profiled in this study, the question turns to identifying opportunities to leverage the best-performing distribution points along their routes to create a longer period of profitability as consolidation continues. The opportunity to extend a positive trend line forecast, even for the short term, could provide the business some critical flexibility as it considers how its operations will continue to be affected by the longer-term shrinking of the newspaper industry. While the forecast points to difficult times ahead, a business such as a newspaper distribution outlet can remain agile and even profitable by responding creatively to trends and by targeting those areas where both positive and negative challenges present themselves.

Even as some enterprising newspapers look to digital media products including video and podcasting and niche websites as options for financial viability, others are creating new print products including print magazines that focus on categories such as food and drink and outdoor recreation as well as high-quality lifestyle publications featuring long-form journalism. Others are joining competitors to launch expanded Sunday editions, while some newspapers have launched specific neighborhood editions tailored to individual communities. Yet others are launching premium print editions that incorporate content that typically was available exclusively on digital and web platforms.

The quality of relationships with individual distribution locations will become more critical as client stores consider the potential of carrying alternative print media products launched by newspaper publishers. In the interim, companies can gain some breathing space by fine-tuning their distribution models using these types of data visualizations. This underscores the importance of remaining financially stable so that these new opportunities are not lost as they become more frequent in a rapidly changing newspaper industry.

REFERENCES

[1] Barthel, M. (2016, June 15). Newspapers: Fact Sheet | Pew Research Center. Retrieved from http://www.journalism.org/2016/06/15/newspapers-fact-sheet/

[2] Fowler Jr., G. L. (1978). The Comparative Readability of Newspapers and Novels. Journalism Quarterly, 55(3), 589-592.

[3] Hurter, A. P., & Van Buer, M. G. (1996). The Newspaper Production/Distribution Problem. Journal of Business Logistics, 17(1), 85.

[4] İncesu, G., Aşıkgil, B., & Tez, M. (2012). Sales Forecasting System for Newspaper Distribution Companies in Turkey. Pakistan Journal of Statistics and Operation Research, 8(3), 685-699. doi:10.18187/pjsor.v8i3.539

[5] Lucena, A. A. (2011). The print newspaper in the information age. Retrieved from http://www.media-ecology.org/publications/MEA_proceedings/v12/9_print.pdf-