Cricket Analysis

Uploaded by

arnoldted7
Copyright
© All Rights Reserved

CRICKET VENUES AND PLAYERS PERFORMANCE ANALYSIS

Name:

Student Number:

Applied session number:

Teacher Associate:

Date:
INTRODUCTION

This report presents a narrative visualization that dives deep into the performance of cricket players and the statistics of venues. The target audience is cricket analysts, coaches, and enthusiasts who need detailed insight into player metrics and venue trends, whether to formulate playing strategies or to deepen their knowledge of the game.

At a basic level, the visualizations should shed detailed and engaging light on several aspects of cricket. They cover the main player performance metrics, such as batting average, strike rate, bowling economy rate, and fielding statistics. These metrics are presented intuitively so that the user can grasp a player's strengths and weaknesses at a glance. Comparative analyses also let users see how players stand relative to one another and how they perform under different conditions.

Another critical component of the visualization is venue statistics. These reveal trends across cricket grounds, including winning margins, pitch behavior, and scoring patterns. This information is essential for coaches and analysts because it helps them devise venue-specific strategies, such as picking the right playing XI or deciding on a game plan based on historical data.

The visualization also drills down to ball-by-ball match dynamics, providing a granular view of how games unfold. This is helpful to fans and analysts who want to break down the flow of a match, understand its turning points, and recognize patterns in play. With this level of detail, the visualization can support informed strategic decisions on field placements, batting orders, and bowling changes.

DESIGN PROCESS
The design process for this narrative visualization project on cricket player and venue performance consisted of a series of iterative steps aimed at creating informative and engaging visualizations for cricket analysts, coaches, and enthusiasts. This section describes and justifies the design choices made during that process, with reference to the five design sheets included in the Appendix.

i) Design sheet 1: Initial brainstorm

The initial brainstorming phase set the project's direction. It revolved around the painstaking process of filtering, categorizing, and combining the available datasets into a cohesive and comprehensive dataset for analysis. Three primary datasets were used: fielding performance data (Data A), match data (Data B), and venue statistics (Data C). Each dataset held vital information, but the three were not harmonized for combined use.

Data cleaning and merging

The first stage was cleaning each dataset. This involved checking for and correcting inconsistencies, dealing with missing values, and standardizing formats. For instance, player names and venue names had to be standardized so that they were consistent across datasets, which ensured that records would match correctly during merging. The critical identifiers linking the files were player names, match IDs, and venue names. By matching on these identifiers, we were able to combine the datasets effectively while retaining the integrity and granularity of the data.
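
The cleaning and merging step can be illustrated with a minimal sketch. The report's R script is not reproduced here, so the example uses plain Python; the rows, column names, and helper functions are hypothetical stand-ins for Data B and Data C, not the author's actual code.

```python
import re


def normalize_name(name):
    """Collapse whitespace, trim, and case-fold a name so spelling variants match."""
    return re.sub(r"\s+", " ", name).strip().casefold()


def merge_matches_with_venues(matches, venues):
    """Left-join match rows to venue rows on the normalized venue name."""
    venue_lookup = {normalize_name(v["venue"]): v for v in venues}
    merged = []
    for match in matches:
        venue = venue_lookup.get(normalize_name(match["venue"]))
        merged.append({
            "match_id": match["match_id"],
            # Prefer the venue table's spelling; fall back to the trimmed original.
            "venue": venue["venue"] if venue else match["venue"].strip(),
            # Unmatched venues keep a missing value instead of being dropped.
            "city": venue["city"] if venue else None,
            "winner": match["winner"],
        })
    return merged


# Hypothetical rows standing in for Data B (matches) and Data C (venues).
matches = [
    {"match_id": 1, "venue": " Eden  Gardens", "winner": "India"},
    {"match_id": 2, "venue": "LORD'S", "winner": "England"},
    {"match_id": 3, "venue": "Unknown Park", "winner": "Australia"},
]
venues = [
    {"venue": "Eden Gardens", "city": "Kolkata"},
    {"venue": "Lord's", "city": "London"},
]

merged = merge_matches_with_venues(matches, venues)
for row in merged:
    print(row)
```

A left join is used in this sketch so that matches at unrecognized venues stay visible with a missing city; whether the original project kept or dropped such rows is not stated in the report.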

Justification for dataset choice


Data A, Data B, and Data C were selected deliberately because they relate directly to cricket performance metrics. Data A provided insights into fielding performance, capturing catches, run-outs, fielding efficiency, and so on. Data B described what happened in every match, covering scores, player statistics, match outcomes, and conditions. Data C provided venue statistics, including pitch conditions, historical averages, and win-loss records at various venues.

Audience consideration

Understanding the needs of the target audience (analysts, coaches, and cricket enthusiasts) was at the forefront of the design. Analysts and coaches require meticulous, accurate metrics to assess a player's performance properly and to plan team strategy.

ii) Design sheet 2: Player performance analysis

Design Sheet 2 focused on player performance, with detailed analysis across the different formats of cricket, namely ODI, T20, and Test. It was primarily aimed at identifying top-performing players on metrics such as matches played, innings, dismissals, and catches.

Design choices – Justification

• Visualization Choice: Bar Chart. A bar chart was selected as the preferred visualization for comparing metrics across formats. A bar chart is very effective at representing categorical data, such as player names, alongside related metrics like matches played, innings, dismissals, and catches. It allows the user to make quick comparisons between players within any format and across formats, so that top performers and trends can be identified easily.


• Color Palette: Steel Blue. A steel blue color scheme was chosen deliberately for several reasons. First, it offers aesthetic appeal and readability, so the charts look colorful yet professional. Second, it gives the project coherence and unity from one visualization to the next, so the user is not tired or distracted by changing colors when moving between sheets. Moreover, steel blue is a neutral color that contrasts well with the white backgrounds often used in data visualizations, improving readability both on screen and in print.

• Impact and user experience: By applying a bar chart format and a consistent steel blue color scheme, Design Sheet 2 presents complex information in a form that is easier for users to digest.

iii) Design sheet 3: Venue performance analysis

This sheet was concerned with understanding how venue characteristics affect match outcomes, looking at the distribution of winning margins at the top five venues. It included interactive histogram bins so that users could explore the distribution of winning margins on their own.

Justification of design choices:

• Interactive Histogram Bins: Interactive histogram bins were implemented to give users flexibility in data analysis. The bins aggregate winning margins into discrete ranges, such as 0-10 runs, 11-20 runs, and so on. Because the bins are adjustable, users can dynamically change the granularity of the displayed data.

• Visual Encoding: Histogram. Since the variable of interest was the winning margin, I chose the histogram as the visual encoding because it is best suited to portraying distribution and frequency. It gives clear information on how the winning margins are distributed at each venue.

Impact and user experience

Design Sheet 3 improves the user experience by combining interactive elements with a visually informative representation of the data. The adjustable histogram bins let users reveal patterns in the winning margins at whatever level of detail interests them. This keeps users more engaged and offers more detail on how venue factors may influence match outcomes.

iv) Design sheet 4: Match dynamics analysis

Sheet 4 was dedicated to ball-by-ball analysis, offering insight into the dynamics of runs scored during matches. A user could select specific batting teams to examine their performance over different phases of the game.

Justification

• Visual variable choice: A line plot was used to show runs scored over time, ball by ball. Different colored lines represent different teams. This allows the user to identify patterns in matches and pinpoint the moments at which the game changed.

• Typographical layout: Clear labels and axis titles keep the ball-by-ball analysis readable and easy to understand.

v) Design sheet 5: Full player and venue analysis


Sheet 5 integrated the insights from Sheets 2-4, giving a comprehensive analysis of player and venue performance. It supports dynamic filtering by player, team, or venue over any period, and therefore allows deep dives into specific aspects of cricket performance metrics.

Justification

• Narration Style: The comprehensive sheet adopted a dashboard layout with multiple linked visualizations, enhancing interactivity and user engagement.

• Audience Engagement: Cricket fans and experts need rich insights into player and venue performance, which guided the choice of interactive filters and visualizations.

Theoretical Underpinnings

Principles from Munzner's what-why-how framework were applied throughout the design process:

• What (Data Abstraction): Ensuring that the data chosen and visualized is relevant and meaningful for cricket analysis.

• Why (Task Abstraction): Justifying the choice of visual encodings by the tasks that users must perform, such as comparing players, analyzing venues, and understanding the dynamics of a match.

• How (Algorithm/Interaction): Building in interactivity that lets users explore and analyze the cricket performance metrics through features such as filtering and dynamic data selection.

IMPLEMENTATION
Technical implementation

The narrative visualization of cricket player and venue performance was implemented in R using the Shiny framework, together with several add-on packages: shinydashboard, tidyverse, ggthemes, and janitor. Shiny provided the interactive web application, while shinydashboard provided a structured and appealing layout. The tidyverse handled data wrangling and visualization, ggthemes added visual polish, and janitor ensured that the data had consistent formatting.

The raw datasets contained many inconsistencies and missing values, so data cleaning and transformation were an essential part of the analysis. Merging on key identifiers with dplyr allowed the data to be combined accurately and meaningfully. ggplot2 was used for the visualizations, and dplyr operations inside Shiny's reactive framework supported the dynamic behavior: filtering the data, adjusting the histogram bins, and selecting specific teams or players.

Implementing these interactive features required careful control of reactive expressions and UI elements in Shiny. Some aspects had to be simplified to improve loading times and the user experience. In addition, some of the metrics conceived in the design were modified because of constraints in the available data. Together, the R libraries described above were used to deliver a working cricket analytics dashboard. Despite the laborious issues around data wrangling and interactivity, the final product appears to strike a balance between performance, usability, and visual appeal. Ultimately, it provides a robust platform for users to explore cricket performance data in depth.

Interactive narrative visualization implementation


The final submission for this web-based cricket analytics dashboard includes five custom-designed interactive visualizations, each intended to offer distinct insights into cricket performance data. The dashboard is built on the Shiny framework in R, allowing users to explore the data dynamically and interactively so that complex datasets are presented in a more accessible and understandable way. Each visualization is described below, along with its purpose and how it conveys insights to the reader.

Visualization 1: Top 10 player performance overview

This display provides an overview of the top 10 cricket players by key performance indicators such as matches, innings, dismissals, and catches. Users can choose any of these metrics from a dropdown menu to update the bar chart. Bars are colored steel blue for clarity and visual consistency. Because the chart is interactive, users can focus on different aspects of player performance, which facilitates comparisons and highlights the best players in each category.
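
The ranking behind such a chart can be sketched as follows. This is an illustration in plain Python rather than the report's R code; the player rows and the function name are hypothetical, and only the four metric names come from the text.

```python
# The four metrics named in the report; everything else here is illustrative.
ALLOWED_METRICS = ("matches", "innings", "dismissals", "catches")


def top_n_by_metric(players, metric, n=10):
    """Return (player, value) pairs for the n highest values of the chosen metric."""
    if metric not in ALLOWED_METRICS:
        raise ValueError(f"unknown metric: {metric!r}")
    ranked = sorted(players, key=lambda p: p[metric], reverse=True)
    return [(p["player"], p[metric]) for p in ranked[:n]]


# Hypothetical player rows; the real dashboard reads these from its CSV files.
players = [
    {"player": "Player A", "matches": 120, "innings": 110, "dismissals": 95, "catches": 60},
    {"player": "Player B", "matches": 150, "innings": 140, "dismissals": 120, "catches": 45},
    {"player": "Player C", "matches": 90, "innings": 88, "dismissals": 70, "catches": 75},
]

# Changing the metric plays the role of the dropdown selection.
print(top_n_by_metric(players, "catches", n=2))
print(top_n_by_metric(players, "matches", n=2))
```

Each call returns the bars to draw, already ordered from highest to lowest, so switching the metric only changes the sort key.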

Figure 1: Top 10 player performance overview

Visualization 2: Distribution of winning margins by runs at top 5 venues


This histogram shows the distribution of winning margins by runs at the top 5 cricket venues. A slider lets the user change the number of bins, and the histogram updates dynamically to show a more or less fine-grained distribution. In this way, it conveys trends in match outcomes and how winning margins vary across venues. The plot is intended to highlight significant trends and outliers and to give a clear visual sense of match results.
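
To make the effect of the bin slider concrete, here is a minimal sketch of equal-width binning in plain Python. The margins are made-up values and the function is hypothetical; the plotting library used in the dashboard may place bin edges differently, so this shows only the general idea.

```python
def histogram_counts(values, n_bins):
    """Split the range of values into n_bins equal-width bins and count each bin."""
    if n_bins < 1:
        raise ValueError("n_bins must be at least 1")
    lowest, highest = min(values), max(values)
    width = (highest - lowest) / n_bins
    if width == 0:
        width = 1  # all values identical: avoid dividing by zero
    counts = [0] * n_bins
    for value in values:
        # The maximum value would land one past the end, so clamp it into the last bin.
        index = min(int((value - lowest) / width), n_bins - 1)
        counts[index] += 1
    edges = [lowest + i * width for i in range(n_bins + 1)]
    return edges, counts


# Hypothetical winning margins (in runs) at a single venue.
margins = [0, 5, 12, 18, 25, 33, 47, 52, 76, 100]

# Moving the slider corresponds to calling the function with a new n_bins.
for n_bins in (2, 4):
    edges, counts = histogram_counts(margins, n_bins)
    print(n_bins, edges, counts)
```

With two bins most matches fall into a single wide bar, while four bins separate close finishes from comfortable wins, which is the kind of detail the adjustable bins are meant to expose.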

Figure 2: Distribution of winning margins by runs at top 5 venues

Visualization 3: Ball-by-ball runs analysis

The third visualization is a line plot of ball-by-ball runs for a user-selected batting team. Users select a team from a dropdown menu, and the plot updates accordingly to show the runs scored during a match. Teams are represented by different colors in the line plot, which makes them easy to distinguish. This allows a user to follow the flow of the game, identify critical moments of play, and analyze team performance in a very fine-grained fashion.
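
One way to prepare data for such a line plot is sketched below in plain Python. It assumes the line tracks a running total of runs, which the report does not state explicitly; the delivery rows, field names, and function are hypothetical.

```python
from itertools import accumulate


def cumulative_runs(deliveries, team):
    """Running total of runs for one batting team, in ball order."""
    team_balls = sorted(
        (d for d in deliveries if d["batting_team"] == team),
        key=lambda d: d["ball"],
    )
    return list(accumulate(d["runs"] for d in team_balls))


# Hypothetical ball-by-ball rows for a single match.
deliveries = [
    {"ball": 1, "batting_team": "Team X", "runs": 1},
    {"ball": 2, "batting_team": "Team X", "runs": 4},
    {"ball": 3, "batting_team": "Team X", "runs": 0},
    {"ball": 4, "batting_team": "Team X", "runs": 6},
    {"ball": 1, "batting_team": "Team Y", "runs": 2},
    {"ball": 2, "batting_team": "Team Y", "runs": 0},
]

# Selecting a team in the dropdown corresponds to changing the team argument.
print(cumulative_runs(deliveries, "Team X"))
print(cumulative_runs(deliveries, "Team Y"))
```

Flat stretches in the running total mark dot balls, and steep jumps mark boundaries, which is how turning points become visible on the line.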


Figure 3: Ball-by-ball runs analysis

Visualization 4: Team performance comparison by venue

This is a grouped boxplot of team performance across different venues. It includes all venues in the dataset except those with missing values. The user can filter by venue or team, which highlights how each team performs at various locations. The boxplot format makes it easier to compare the spread and central tendency of team performances, bringing out consistent performers and the probable advantages or disadvantages of a venue.
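
The statistics that a boxplot summarizes can be sketched in plain Python as follows. The rows and function names are hypothetical, and quartile conventions vary between tools, so the exact box edges in the dashboard may differ slightly from this illustration.

```python
from collections import defaultdict
from statistics import quantiles


def five_number_summary(scores):
    """Minimum, quartiles, and maximum: the values a boxplot draws."""
    q1, q2, q3 = quantiles(scores, n=4, method="inclusive")
    return {"min": min(scores), "q1": q1, "median": q2, "q3": q3, "max": max(scores)}


def summarize_by_venue_and_team(rows):
    """Group scores by (venue, team), skipping rows with missing values."""
    groups = defaultdict(list)
    for row in rows:
        if row["runs"] is None:
            continue  # mirrors the exclusion of missing values described above
        groups[(row["venue"], row["team"])].append(row["runs"])
    # A box needs at least two scores to show any spread.
    return {key: five_number_summary(vals) for key, vals in groups.items() if len(vals) >= 2}


# Hypothetical innings totals.
rows = [
    {"venue": "Venue 1", "team": "Team X", "runs": r}
    for r in (120, 150, 160, 180, 200, None)
] + [{"venue": "Venue 2", "team": "Team X", "runs": 175}]

summary = summarize_by_venue_and_team(rows)
print(summary)
```

A narrow gap between q1 and q3 suggests a consistent team at that venue, while a wide gap or a long whisker suggests volatile results.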

Figure 4: Team performance comparison by venue

Visualization 5: Total runs scored vs. total runs conceded by venue


This last visualization contrasts the total runs scored with the total runs conceded for each team at different venues. The user can filter the data to include or exclude players, which personalizes the analysis. Trend lines in this scatter plot show overall patterns and correlations, allowing users to understand how a team's run scoring (offense) and runs conceded (defense) are related. The plot gives an overview of team dynamics and of how balanced each team's performance is.
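
A trend line of this kind can be sketched with an ordinary least-squares fit. The report does not say which smoothing method the dashboard uses, so a straight-line fit is an assumption here, and the totals below are made-up values in plain Python.

```python
def fit_trend_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two paired observations")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical team totals at one venue: runs scored (x) and runs conceded (y).
runs_scored = [1000, 2000, 3000, 4000]
runs_conceded = [1100, 1900, 3100, 3900]

slope, intercept = fit_trend_line(runs_scored, runs_conceded)
print(f"conceded = {slope:.2f} * scored + {intercept:.1f}")
```

A slope below 1 in this sketch would mean that teams scoring more at the venue concede proportionally less, which is one way to read the balance between scoring and conceding.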

Figure 5: Total runs scored vs. total runs conceded by venue

Using the implementation

This section explains how to run the cricket analytics dashboard. First, ensure that R and RStudio are installed on your system, along with the required libraries: shiny, shinydashboard, tidyverse, ggthemes, and janitor. Next, place the CSV files at the file paths specified in the script. Then open the R script in RStudio and run the app to open it in a browser.

There are five tabs, each containing a different visualization. The first tab, "Top 10 Player Performance Overview," contains an interactive bar chart that lets users select a metric from the dropdown menu and then compare player performance. The second tab, "Distribution of Winning Margins by Runs," provides an interactive slider that controls the number of histogram bins, so users can see in detail how the winning margins are distributed.

The third tab, "Ball-by-Ball Runs Analysis," includes a line plot of runs per ball for user-selected batting teams. A dropdown menu lets users switch between teams to observe how various matches progressed. The fourth tab, "Team Performance Comparison by Venue," contains a grouped boxplot comparing team performances at different venues. Using another dropdown menu, users can filter by the available venues, which ensures that only relevant data is shown.

The fifth tab, "Total Runs Scored vs. Total Runs Conceded," contains a scatter plot of the teams' runs scored against runs conceded, along with a dropdown filter for players to create personalized views. The interactive features include dropdown menus for filtering data in visualizations 1, 3, 4, and 5; a slider for adjusting bins in visualization 2; and tooltips over data points for extra detail. Appropriate color coding and legends help the user interpret the data, making the dashboard a valuable tool for cricket analysts and enthusiasts.

CONCLUSION

The Cricket Analytics Dashboard visualizes several facets of cricket data: player performance, winning margins, ball-by-ball runs, team performance across multiple venues, and a comparison of runs scored and conceded. Filters on each visualization let users drill into the details and filter the data in real time to better understand cricket statistics.

The project involved consolidating several datasets, cleaning and transforming the data for visualization, and setting up interactivity in Shiny. In the process, I realized that user-centered design makes visualizations not just informative but also intuitive and engaging for the audience. This project taught me how effective sound data visualization principles are in promoting understanding: appropriate use of color, suitable chart type selection, and meaningful labeling.

One area for improvement is the time taken to load and process data, especially with large datasets. More efficient data handling, or preprocessing the data before loading it into the application, would improve performance. A further step would be to add advanced analytics features, such as predictive modeling or clustering, to draw more insight from the data.

In the future, the dashboard could also incorporate live feeds for real-time analysis of ongoing matches. Predictive analysis could be added through machine learning algorithms that predict match outcomes or player performances from within the dashboard. This project has laid a strong foundation for interactive, insight-driven data visualization in the sports analytics domain.


REFERENCES

Wang, K. T. K. (2020, September 29). Five Design-Sheet Methodology Approach to Data Visualisation. Medium. https://towardsdatascience.com/five-design-sheet-methodology-approach-to-data-visualisation-603d760f2418
