Steel - Case Study
Ashu Sharma
Manoharan Divyalakshmi
Neha Arora
Sapna Bhandari
Praveena Venu V
Strips
The variables “THICKNESS” and “WIDTH” denote the number of strips from the respective cluster. However, the total number of strips in the 3 width clusters had to be the same as the total number of strips in the 3 thickness clusters.
RTR theory
Run Time = Operating Time - Non-Operating Time (i.e., breakdowns, exceeding downtime for maintenance / set-up time)
Run Time Ratio: RTR = Run Time / Operating Time
Non-Operating Time directly influences the RTR.
RTR THEORY: The lower the RTR, the higher the negative deviation from the plan.
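The two RTR formulas can be checked with a few lines of R. This is a minimal sketch: the data frame `shifts_demo`, its column names, and the minute values are illustrative assumptions, not figures from the case.

```r
# Hypothetical shift data in minutes; values are illustrative, not from the case
shifts_demo <- data.frame(
  operating_time     = c(480, 480, 480),
  non_operating_time = c(24, 96, 0)
)

# Run Time = Operating Time - Non-Operating Time
shifts_demo$run_time <- shifts_demo$operating_time - shifts_demo$non_operating_time

# RTR = Run Time / Operating Time, expressed in percent
shifts_demo$RTR <- shifts_demo$run_time / shifts_demo$operating_time * 100

print(shifts_demo)
```

The more non-operating time a shift accumulates, the lower its RTR, which is the quantity the RTR theory links to negative deviations from the plan.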
MPT theory (Meters Per Ton)
MPT indicates whether the material structure is favorable or unfavorable; it is a metric of the structure. In general, material with a low thickness and/or low width carries a lower weight per meter. It takes longer to put 1 t of material through the production line if the process speed remains constant.
MPT THEORY: The higher the MPT, the higher the negative deviation from the plan.
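The link between strip dimensions and MPT can be illustrated with a short calculation. This is a sketch under stated assumptions: the case does not say how MPT is computed, so the formula below (strip cross-section times an assumed steel density of 7850 kg per cubic meter), the function name `meters_per_ton`, and the line speed are all hypothetical.

```r
steel_density <- 7850  # kg per cubic meter; assumed typical value for steel

# Meters of strip needed to make up one metric ton (hypothetical helper)
meters_per_ton <- function(thickness_mm, width_mm, density = steel_density) {
  kg_per_meter <- (thickness_mm / 1000) * (width_mm / 1000) * density
  1000 / kg_per_meter
}

line_speed <- 100  # meters per minute; hypothetical constant process speed

thin  <- meters_per_ton(thickness_mm = 1, width_mm = 1000)
thick <- meters_per_ton(thickness_mm = 2, width_mm = 1500)

print(c(thin = thin, thick = thick))               # meters per ton
print(c(thin = thin, thick = thick) / line_speed)  # minutes needed per ton
```

Under these assumptions the thinner, narrower strip has the higher MPT and therefore needs more line time per ton at constant speed.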
• What is the average number of strips per shift?
The overall average is 36.83 strips per shift. (Reference from the business case: the total number of strips in the 3 width clusters had to be the same as the total number of strips in the 3 thickness clusters.)
[Figures: number of shifts and average strips per shift by weekday (Mon to Sun), by shift type (E, M, N), and by shift group (shift 1 to shift 5); total average 36.83.]
Day           Sum of Total Shifts   Average of Total Thickness
Mon           70                    37.14
Tue           63                    33.67
Wed           69                    33.99
Thu           76                    36.68
Fri           74                    37.11
Sat           74                    39.08
Sun           74                    39.51
Grand Total   500                   36.83

Shift type    Sum of Total Shifts   Average of Total Thickness
E             156                   35.74
M             174                   36.24
N             170                   38.45
Grand Total   500                   36.832

Shift group   Average of Total Thickness
shift 1       33.59
shift 2       35.91
shift 3       38.15
shift 4       36.79
shift 5       39.60
Grand Total   36.832
> Steel$Total_Strips<-(Steel$thickness.1+Steel$thickness.2+Steel$thickness.3)
> Average_Strips<-mean(Steel$Total_Strips)
> Average_Strips
[1] 36.832
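The per-day pivot figures can be reproduced in R as well. The sketch below uses a small simulated stand-in, `Steel_demo`; the grouping column name `day` is an assumption, since the real column name is not shown in the transcript.

```r
set.seed(1)

# Simulated stand-in for the Steel data frame; the column `day` is assumed
Steel_demo <- data.frame(
  day         = rep(c("Mon", "Tue", "Wed"), each = 4),
  thickness.1 = rpois(12, 12),
  thickness.2 = rpois(12, 20),
  thickness.3 = rpois(12, 5)
)
Steel_demo$Total_Strips <- Steel_demo$thickness.1 +
  Steel_demo$thickness.2 + Steel_demo$thickness.3

# Number of shifts and average strips per shift for each day
by_day <- aggregate(
  Total_Strips ~ day, data = Steel_demo,
  FUN = function(x) c(shifts = length(x), avg_strips = mean(x))
)
print(by_day)

# Overall average across all shifts, as in Average_Strips above
print(mean(Steel_demo$Total_Strips))
```

Replacing `day` with the shift-type or shift-group column would give the other two pivot tables in the same way.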
• Strips of which thickness cluster are the most common, and strips of which thickness cluster are the least common?
The thickness 2 cluster is the most common, at around 54.7% of the total strips.
The thickness 3 cluster is the least common, at 12.98%.
Shift 1 output is very low compared to other shift groups
[Figure: number of shifts and share of strips per thickness cluster (thickness 1, thickness 2, thickness 3) by shift group.]
[Figure: sum of delta throughput by shift group.]

Shift group   Sum of delta throughput
shift 1       (3,759.39)
shift 2       1,177.93
shift 3       4,357.72
shift 4       1,554.66
shift 5       4,561.61
Grand Total   7,892.53
> A=sum(Steel$thickness.1)/sum(Steel$Total_Strips)*100
> B=sum(Steel$thickness.2)/sum(Steel$Total_Strips)*100
> C=sum(Steel$thickness.3)/sum(Steel$Total_Strips)*100
> A
[1] 32.31972
> B
[1] 54.70243
> C
[1] 12.97785
> # A, B and C are percentage shares of total strips, so the y-axis is labelled in percent
> barplot(c(A,B,C),col=c("gray","lightblue","blue"),main="Thickness values",xlab="Thickness type",ylab="share of total strips (%)",legend=c("thickness 1","thickness 2","thickness 3"))
• What are the min, max, and average values of delta throughput and RTR?

                   MIN       Average   MAX
Delta Throughput   -661.83   15.79     730.28
RTR                21.70     85.78     100.00

[Figures: min and max of delta throughput by shift group; min and max of RTR by shift group (the max of RTR is 100.00 in every shift group).]

> par(mfrow=c(1,2))
> boxplot(summary(Steel$delta.throughput),col="red",main="Delta Throughput")
> boxplot(summary(Steel$RTR),col="blue",main="RTR")
> Delta_throughput_Summary
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
-661.83 -132.69   16.88   15.79  156.34  730.28
> Run_Time_Ratio
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
  21.70   81.30   88.50   85.78   93.50  100.00
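The min, average, and max table can be built in one step for both variables. The sketch below runs on simulated data (`demo_stats` is a hypothetical stand-in for the Steel data frame) and passes the raw columns to `boxplot()`, so that the boxes and outliers are computed from all observations rather than from the six summary values.

```r
set.seed(2)

# Simulated stand-in for the two columns of interest; values are illustrative
demo_stats <- data.frame(
  delta.throughput = rnorm(50, mean = 15, sd = 180),
  RTR              = pmin(100, rnorm(50, mean = 86, sd = 12))
)

# Min, average and max for each column in a single table
stats <- sapply(demo_stats, function(x) c(Min = min(x), Average = mean(x), Max = max(x)))
print(round(stats, 2))

# Box plots drawn from the raw observations
par(mfrow = c(1, 2))
boxplot(demo_stats$delta.throughput, col = "red", main = "Delta Throughput")
boxplot(demo_stats$RTR, col = "blue", main = "RTR")
```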
• Are there shifts during which the PPL processes strips of only steel grade 1, or of only steel grade 2, etc.?
Shifts in which the PPL processes 100% of a single grade:
Grade 1: 5 shifts, Grade 4: 8 shifts, Grade 5: 1 shift, Grade rest: 9 shifts
Total: 23 shifts
[Figure: delta throughput and number of single-grade shifts by grade (Grade 1, Grade 4, Grade 5, Grade Rest); surviving delta throughput data labels: 1,399.83, (588.66), (948.13), (2,080.87).]

> grade_1<-table(Steel$grade.1==100)
> grade_1
FALSE  TRUE
  495     5
> grade_2<-table(Steel$grade.2==100)
> grade_2
FALSE
  500
> grade_3<-table(Steel$grade.3==100)
> grade_3
FALSE
  500
> grade_4<-table(Steel$grade.4==100)
> grade_4
FALSE  TRUE
  492     8
> grade_5<-table(Steel$grade.5==100)
> grade_5
FALSE  TRUE
  499     1
> grade_rest<-table(Steel$grade.rest==100)
> grade_rest
FALSE  TRUE
  491     9
> grade_table=c(grade_1[2],grade_4[2],grade_5[2],grade_rest[2])
> grade_table
TRUE TRUE TRUE TRUE
5 8 1 9
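The same counts can be obtained for all grade columns at once. This is a minimal sketch on a toy data frame (`grades_demo` is hypothetical); it assumes, as in the transcript above, that the grade columns hold percentages and share the `grade.` prefix.

```r
# Toy stand-in: each row is a shift, each column the percentage of one grade
grades_demo <- data.frame(
  grade.1    = c(100, 40,   0,  0),
  grade.2    = c(  0, 60,   0, 50),
  grade.rest = c(  0,  0, 100, 50)
)

# Select every column whose name starts with "grade."
grade_cols <- grep("^grade\\.", names(grades_demo), value = TRUE)

# Count the shifts in which a single grade makes up 100% of the strips
single_grade_shifts <- colSums(grades_demo[grade_cols] == 100)
print(single_grade_shifts)
print(sum(single_grade_shifts))
```

Counting with `colSums()` returns 0 for grades that never reach 100%, so no special handling is needed for tables that contain only a FALSE entry.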
• Can the RTR theory adequately explain the deviations from the planned production figures?
• RTR theory: The lower the RTR, the higher the negative deviation from the plan.
• The graphs indicate that delta throughput rises as RTR increases. At higher RTR, delta throughput is positive, i.e., actual production exceeds planned production.
• For lower RTR values, there is a negative deviation from the plan, i.e., delta throughput is negative and actual production falls short of planned production. This is a positive linear relationship.
• In other words, when a shift runs efficiently, delta throughput tends to be high and actual production exceeds planned production.
[Figure: average of RTR and average of delta throughput by shift group (shift 1 to shift 5); surviving data labels for average RTR: 90.10, 87.50, 81.86, 85.13, 84.15; for average delta throughput: 43.58, 43.86, 16.36, 11.55, (37.97).]
[Figure: Delta Throughput vs RTR, scatter plot with fitted regression line; x-axis: RTR, y-axis: delta.throughput.]
There is a statistically significant positive correlation between RTR and delta throughput (correlation coefficient +0.5568).
The linear regression model shows a positive relationship between RTR and delta throughput: when a shift runs efficiently with no production problems, the delta throughput is high.
SUMMARY OUTPUT

Regression Statistics
Multiple R          0.556778
R Square            0.310002
Adjusted R Square   0.308616
Standard Error      178.2591
Observations        500

ANOVA
             df    SS            MS         F             Significance F
Regression   1     7109656.101   7109656    223.7409081   4.77167E-42
Residual     498   15824592.69   31776.29
Total        499   22934248.79

            Coefficients   Standard Error   t Stat     P-value       Lower 95%      Upper 95%
Intercept   -865.5         59.45433252      -14.5574   2.97101E-40   -982.3126024   -748.688
RTR         10.27369       0.686837181      14.95797   4.77167E-42   8.924235122    11.62315

[Figure: RTR Residual Plot; x-axis: RTR, y-axis: Residuals.]

• The p-value is less than 5%, so RTR is a significant predictor of delta throughput.
• From a model adequacy point of view, however, R² is only 31% and the residual standard error is high (178.26), which suggests that other significant predictors of delta throughput exist.
• The scatter plot shows the residuals of the regression model. The plot of predicted delta throughput vs residuals is funnel-shaped; hence the data exhibit HETEROSCEDASTICITY.
• Heteroscedasticity refers to the situation where the variance of the errors in a regression model is not constant across all levels of the independent variable(s). The coefficient estimates remain unbiased but become inefficient, and the usual standard errors are biased, which affects the validity of statistical tests and confidence intervals.
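A funnel-shaped residual plot can be backed up with a formal test. The sketch below computes the studentized (Koenker) form of the Breusch-Pagan statistic by hand in base R: squared residuals are regressed on the predictor, and n times the R² of that auxiliary regression is compared with a chi-squared distribution. The data are simulated with an error spread that grows with `x`; all variable names are illustrative and nothing here comes from the Steel data.

```r
set.seed(3)
n <- 200
x <- runif(n, 20, 100)

# Error spread grows with x, so the data are heteroscedastic by construction
y <- -800 + 10 * x + rnorm(n, sd = 2 * x)
fit <- lm(y ~ x)

# Auxiliary regression of the squared residuals on the predictor
sq_res <- residuals(fit)^2
aux <- lm(sq_res ~ x)

# Studentized Breusch-Pagan statistic: n * R^2 of the auxiliary regression
bp_stat <- n * summary(aux)$r.squared
bp_df   <- length(coef(aux)) - 1
bp_p    <- pchisq(bp_stat, df = bp_df, lower.tail = FALSE)
print(c(statistic = bp_stat, df = bp_df, p.value = bp_p))
```

A small p-value speaks against constant error variance. If the lmtest package is available, `bptest(fit)` should report the same studentized statistic.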
> # Delta_throughput vs RTR
> cor(RTR,delta.throughput) # correlation coefficient
[1] 0.556778
> # Fitting the Simple Linear Regression Models
> # Model 1: Delta_Throughput on RTR
> mod_1=lm(delta.throughput~RTR,data=Steel)
> summary(mod_1)

Call:
lm(formula = delta.throughput ~ RTR, data = Steel)

Residuals:
    Min      1Q  Median      3Q     Max
-823.70 -130.00    1.61  118.48  585.88

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) -865.5004    59.4543  -14.56   <2e-16 ***
RTR           10.2737     0.6868   14.96   <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 178.3 on 498 degrees of freedom
Multiple R-squared: 0.31, Adjusted R-squared: 0.3086
F-statistic: 223.7 on 1 and 498 DF, p-value: < 2.2e-16

> # Scatter Plots with Fitted Regression Lines
> # Delta_throughput vs RTR
> plot(RTR,delta.throughput,main="Delta throughput Vs RTR",col="red",lwd=4)
> # For adding the fitted regression line
> abline(mod_1,lwd=5,col="orange")
> # 95% Confidence Interval for the Model Parameters
> confint(mod_1)
                  2.5 %     97.5 %
(Intercept) -982.312602 -748.68811
RTR            8.924235   11.62315
Statistically significant positive correlation between RTR and delta throughput (Correlation coefficient is + 0.5568)
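The significance of a correlation coefficient can be tested directly with `cor.test()`, and in a simple linear regression the squared correlation equals R², which is why a correlation of 0.5568 goes with an R² of about 0.31. The sketch below uses simulated values; `rtr_sim` and `delta_sim` are hypothetical names, not columns of the Steel data.

```r
set.seed(4)

# Simulated stand-ins for RTR and delta throughput
rtr_sim   <- runif(100, 20, 100)
delta_sim <- -865 + 10 * rtr_sim + rnorm(100, sd = 180)

# Pearson correlation with a test of the null hypothesis of zero correlation
ct <- cor.test(rtr_sim, delta_sim)
print(ct$estimate)
print(ct$p.value)

# In simple linear regression, the squared correlation equals R-squared
r2 <- summary(lm(delta_sim ~ rtr_sim))$r.squared
print(all.equal(unname(ct$estimate)^2, r2))
```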
• MPT theory:
MPT indicates whether the material structure is favorable or unfavorable; it is a metric of the structure. In general, material with a low thickness and/or low width carries a lower weight per meter. It takes longer to put 1 t of material through the production line if the process speed remains constant.
• So, negative deviations in months with average or above-average RTR could be explained by this metric.
• High MPT figures are the cause of high negative delta throughput for shifts.
• But some other variables are causing additional deviation (since R² = 0.4452).
[Figure: Delta throughput vs MPT, scatter plot; x-axis: MPT, y-axis: delta.throughput.]
• Is the MPT theory sufficient to explain the deviations? Explain why or why not.
SUMMARY OUTPUT

Regression Statistics
Multiple R          0.66726
R Square            0.445235
Adjusted R Square   0.444122
Standard Error      159.8387
Observations        500

ANOVA
             df    SS            MS         F             Significance F
Regression   1     10211141.57   10211142   399.6781934   1.02517E-65
Residual     498   12723107.21   25548.41
Total        499   22934248.79

            Coefficients   Standard Error   t Stat     P-value       Lower 95%      Upper 95%
Intercept   389.5816       20.01718301      19.46236   3.66564E-63   350.2530219    428.9101
MPT         -10.87         0.543719739      -19.992    1.02517E-65   -11.93828696   -9.80175

[Figure: residual plot for the MPT regression; x-axis: MPT, y-axis: residuals.]

• The p-value is less than 5%, so MPT is a significant predictor of delta throughput.
• From a model adequacy point of view, however, R² is only 44.52% and the residual standard error is high (159.84), which suggests that other significant predictors of delta throughput exist.
• The scatter plot shows the residuals of the regression model. The plot of predicted delta throughput vs residuals is funnel-shaped; hence the data exhibit HETEROSCEDASTICITY.
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 389.5816 20.0172 19.46 <2e-16 ***
MPT -10.8700 0.5437 -19.99 <2e-16 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Statistically significant Negative Correlation (-0.6673): If MPT increases, the negative deviation is higher
Independent Basis:
Going forward with: Regression Analysis #3
Variables: RTR; Thickness 1, 2, 3; Width 1, 2
1. p-values are very low for all the independent variables.
2. The difference between multiple R² and adjusted R² is the smallest, i.e. 0.0017.
3. Although the RSE is 78.88%, the high F-statistic and the very low p-values give confidence to consider this model further.
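The selection criteria listed above (p-values, the gap between multiple and adjusted R², RSE, F-statistic) can all be read from `summary()` of a fitted model. The sketch below fits a multiple regression on simulated data; `reg_demo`, its coefficients, and the reduced set of predictors are illustrative assumptions and do not reproduce Regression Analysis #3.

```r
set.seed(6)
n <- 120

# Simulated stand-in with a subset of the predictor names used in the deck
reg_demo <- data.frame(
  RTR         = runif(n, 20, 100),
  thickness.1 = rpois(n, 12),
  thickness.2 = rpois(n, 20),
  width.1     = rpois(n, 10)
)
reg_demo$delta.throughput <- -600 + 8 * reg_demo$RTR -
  5 * reg_demo$thickness.1 + rnorm(n, sd = 80)

fit_multi <- lm(delta.throughput ~ RTR + thickness.1 + thickness.2 + width.1,
                data = reg_demo)
s <- summary(fit_multi)

# Quantities used to compare candidate models
print(c(r2     = s$r.squared,
        adj_r2 = s$adj.r.squared,
        gap    = s$r.squared - s$adj.r.squared,
        rse    = s$sigma))
print(s$coefficients[, "Pr(>|t|)"])  # p-value of each coefficient
```

A small gap between R² and adjusted R² indicates that the model is not being rewarded merely for carrying extra predictors.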
[Figures: two normal Q-Q plots of the residuals; y-axis: Sample Quantiles.]
Analyzing the Residuals: Residuals are standardized by subtracting the mean of the residuals from each residual value and then dividing by the standard deviation of the residuals.
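The standardization described above, followed by a normality check, can be sketched as follows. The data are simulated and the model is a stand-in; only the name `stdr` follows the test output shown in the deck.

```r
set.seed(5)
x_sim <- runif(100, 20, 100)
y_sim <- 5 + 2 * x_sim + rnorm(100, sd = 10)
fit_sim <- lm(y_sim ~ x_sim)

# Standardize: subtract the mean of the residuals, divide by their standard deviation
res  <- residuals(fit_sim)
stdr <- (res - mean(res)) / sd(res)

# Normal Q-Q plot and Shapiro-Wilk normality test on the standardized residuals
qqnorm(stdr)
qqline(stdr)
print(shapiro.test(stdr))
```

This simple z-score differs from `rstandard()`, which additionally adjusts each residual for its leverage.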
[Figures: residuals (Steel$res) and standardized residuals (stdr) plotted against predicted values (Steel$pred).]
• Schulze’s theory: Interpretation
Shapiro-Wilk normality test
data: stdr
W = 0.97012, p-value = 1.448e-08
Studentized Breusch-Pagan Test
- Since the VIF values do not exceed the usual thresholds of 5 or 10, this case does not exhibit problematic multicollinearity.
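The VIF criterion mentioned above can be computed in base R from its definition: each predictor is regressed on the remaining predictors, and VIF = 1 / (1 - R²) of that regression. The sketch below uses simulated, independent predictors in a hypothetical data frame `X_demo`; the deck's actual VIF values are not shown, so nothing here reproduces them.

```r
set.seed(7)
n <- 150

# Simulated predictors; names mirror the deck, values are illustrative
X_demo <- data.frame(
  RTR         = runif(n, 20, 100),
  thickness.1 = rpois(n, 12),
  width.1     = rpois(n, 10)
)

# VIF of each predictor: regress it on all the others, then 1 / (1 - R^2)
vif_values <- sapply(names(X_demo), function(v) {
  others <- setdiff(names(X_demo), v)
  r2 <- summary(lm(reformulate(others, response = v), data = X_demo))$r.squared
  1 / (1 - r2)
})
print(round(vif_values, 3))
```

Values close to 1 indicate almost no linear dependence among predictors; the thresholds of 5 or 10 are common rules of thumb.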
• Shifts:
• Shift 1 needs to improve its productivity compared with the other shifts.
[Figure: chart by shift group; surviving data labels: 43.58, 43.86, 16.36.]