Data-Enabled insight from Sericulture
Jayalaxmi Agro Tech
By Hustlers (4A)
Soumya Tiwary (140040)
Anurag (140007)
Monu Kumar (140024)
Sneha Anand (140039)
Shantanu Sinha (140034)
Monik (140087)
Summary
Jayalaxmi Agro Tech (JAT), a company based out of Bellary in Karnataka and
co-founded by Anand Babu, strives to keep the Indian farmer informed about the
modern best practices, thereby boosting the agricultural yield. The company's
flagship product is a suite of crop-specific mobile apps in several regional
languages with heavy emphasis on audio-visual content to break the language and
literacy barrier prevalent in rural areas. The farmer is empowered with the right
information at the right time to make agriculture sustainable and more profitable.
JAT intended to collect data on sericulture (rearing of silkworms for producing
raw silk) to improve income of silk producers. Karnataka is one of the largest
producers of raw silk in India. Sericulture requires less investment but offers high
returns if done correctly. Sericulture also involves cultivation of mulberry trees,
the leaves of which are used to feed the silkworms. The yield of sericulture is
heavily dependent on the quality of inputs such as the type of silkworm breed
used, quality of the mulberry leaves and environmental conditions of the
silkworm rearing house. Jayalaxmi Agro Tech collected farmer level data on
sericulture practices in the districts of Belagavi, Bellary, Chikballapur, Mandya,
and Tumakuru in the state of Karnataka. The company wanted to analyze the data
collected to gain insights so that it could make a grassroots-level impact by
fine-tuning sericulture as an occupation. These insights could also help in
building better policy interventions to improve the welfare of sericulture farmers.
Analysis of Sericulture data from Karnataka:
The data collected from sericulture farmers can be used for conducting simple
hypothesis tests and building regression models to understand the factors that are
associated with income generated in agriculture. The idea is to share insights with
the sericulture farmers so that they can make informed decisions.
Hypothesis Development:
As we intend to study the factors that affect the income generated in
agriculture, we will use a multiple linear regression model.
Income per acre is our dependent variable (Y), and the other factors
are the independent variables. I have chosen the following independent
variables.
• Training on sericulture. (training_on_sericulture)
• Crop insured. (crop_insured)
• Amount of loan taken by the farmer. (loan_amount)
• Total subsidy received from the sericulture department.
(seri_total_subsidy)
• Cost of rearing silkworms. (rearing_cost)
• Cost of managing the instruments required for sericulture.
(instrument_mgmt_cost)
• Number of years of experience of the farmer in sericulture.
(years_of_exp_in_sericulture)
• Biofertilizers used. (bio_fertilizers)
• Farmer is mechanized. (mechanization)
Hypothesis development is important because it enables us to develop a specific
direction as well as better understanding about the subject matter of the study.
Null Hypothesis
Income is not affected by Training on sericulture, Crop insured, Amount of loan
taken by the farmer, Total subsidy received from the sericulture department,
Cost of rearing silkworms, Cost of managing the instruments required for
sericulture, Number of years of experience of the farmer in sericulture,
Biofertilizers used, and Farmer mechanization.
Alternate Hypothesis:
Income is affected by Training on sericulture.
Income is affected by Crop insurance.
Income is affected by Amount of loan taken by the farmer.
Income is affected by Total subsidy received from the sericulture department.
Income is affected by Cost of rearing silkworms.
Income is affected by Instruments management cost.
Income is affected by Years of experience of farmer.
Income is affected by Biofertilizers.
Income is affected by the Mechanization of a farmer.
Methodology:
I will be using Multiple Linear Regression. I filtered my data for the dependent
and independent variables. Then I applied the multiple linear regression model,
and the test results are shown as follows.
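The fitting step can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the procedure actually used for the report: the helper `fit_ols` is hypothetical, the toy data are made up, and only two of the nine independent variables are included to keep the example short. It solves the normal equations (X'X)b = X'y directly.

```python
def fit_ols(rows, y):
    """Least-squares fit of y = b0 + b1*x1 + ... via the normal equations."""
    X = [[1.0] + list(r) for r in rows]  # prepend a column of ones for the intercept
    p = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(xi[a] * xi[b] for xi in X) for b in range(p)]
         + [sum(xi[a] * yi for xi, yi in zip(X, y))]
         for a in range(p)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(p):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [v - factor * w for v, w in zip(A[r], A[col])]
    return [A[i][p] / A[i][i] for i in range(p)]

# Hypothetical toy data: (rearing_cost, mechanization) for six farmers
rows = [(10, 0), (20, 1), (30, 0), (40, 1), (50, 0), (25, 1)]
# Hypothetical income generated exactly as 5 + 2*rearing_cost + 7*mechanization
income = [5 + 2 * cost + 7 * mech for cost, mech in rows]

intercept, b_cost, b_mech = fit_ols(rows, income)
print(round(intercept, 6), round(b_cost, 6), round(b_mech, 6))
```

Because the toy income is an exact linear function of the inputs, the fit recovers the generating coefficients. With real survey data, a statistics package that also reports standard errors, t-stats and p-values would be the practical choice.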
Multiple R is the multiple correlation coefficient, which tells us how strong
the linear relationship is between the observed income and the income predicted
by the model. In this case it is about 0.53 (53%), which indicates a positive
linear relationship of only moderate strength.
Standard error is 33227, which is the typical size of the error term: on
average, the observed income of a farmer differs from the income predicted by
the model by about this much. This part of the variation is not explained by
the chosen independent variables and may be due to other factors which we have
not considered.
When there is more than one independent variable, we consider adjusted R
Square. Here the value is 26%, which means that the independent variables
together explain about 26% of the variation in the dependent variable. This is
a measure of association and does not by itself show that the independent
variables cause the change in income.
The total number of observations is 508.
Since the Significance F value is less than 5%, we reject the null hypothesis
and say that the income is affected by at least one of our independent variables.
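As a rough cross-check, adjusted R Square and the overall F statistic can be recomputed from Multiple R, the number of observations and the number of independent variables. The sketch below uses the rounded Multiple R of 0.53 and assumes that each of the nine listed variables enters the model as a single regressor, so the outputs only approximate the figures in the regression output.

```python
n = 508             # number of observations, from the report
k = 9               # assumption: each listed variable is one regressor
multiple_r = 0.53   # rounded value quoted in the report

r_squared = multiple_r ** 2
# Adjusted R^2 penalises R^2 for the number of regressors used
adj_r_squared = 1 - (1 - r_squared) * (n - 1) / (n - k - 1)
# Overall F statistic: explained variance per regressor over residual variance
f_stat = (r_squared / k) / ((1 - r_squared) / (n - k - 1))

print(f"R^2 = {r_squared:.3f}, adjusted R^2 = {adj_r_squared:.3f}, F = {f_stat:.1f}")
```

An F statistic of this size on (9, 498) degrees of freedom is far above the usual 5% critical value, which is consistent with the very small Significance F reported.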
We will now analyze each of our independent variables and see how income
responds to it. The y-intercept value represents the value of Y when all the
X's are zero.
Rearing:
The p-value of rearing cost is less than 5%, so we reject the null hypothesis
and say that it has an effect on the income of the farmer per acre. The
coefficient of rearing cost is positive, which means that there is a positive
association between income and rearing cost. For every 1 rupee rise in rearing
cost, the income is expected to increase by 0.224 rupees, holding the other
variables constant. The t-stat value is 3.14, which is greater than 2, so this
also supports rejecting the null hypothesis.
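The coefficient and t-stat quoted above are enough to back out the standard error of the coefficient and an approximate 95% confidence interval. The sketch below uses the normal approximation to the t distribution, which is an assumption that is reasonable here only because the sample is large (508 observations).

```python
from statistics import NormalDist

coef = 0.224     # rearing cost coefficient, from the report
t_stat = 3.14    # t-stat of the coefficient, from the report

# t-stat = coefficient / standard error, so the standard error follows directly
std_err = coef / t_stat

# Normal approximation to the t distribution (assumption: large sample)
z_975 = NormalDist().inv_cdf(0.975)
ci_low = coef - z_975 * std_err
ci_high = coef + z_975 * std_err
p_two_sided = 2 * (1 - NormalDist().cdf(abs(t_stat)))

print(f"standard error = {std_err:.4f}")
print(f"approx. 95% CI = ({ci_low:.3f}, {ci_high:.3f})")
print(f"approx. two-sided p-value = {p_two_sided:.4f}")
```

The interval lies entirely above zero, which is the same conclusion as rejecting the null hypothesis for this coefficient at the 5% level.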
Crop insured:
Its coefficient is negative, which means that there is a negative association
between income and the insurance of crops. The p-value clearly indicates that we
can reject the null hypothesis and state that there exists a relationship between
these 2 variables. On average, farmers with insured crops have an income per
acre that is lower by 14906, holding the other variables constant.
Years of experience:
The p-value of this independent variable is more than 5%, so in this case we
cannot reject the null hypothesis. There is no statistically significant
evidence that the income is affected by the years of experience of a farmer.
Training on sericulture:
The p-value in this case is less than 5%, so we can reject the null hypothesis and
state that there exists a relationship between income and training on sericulture.
Income is impacted by this variable. The relationship is negative, as shown by
the sign of the coefficient.
Biofertilizers and mechanization:
As the p-values of both these variables are less than 5%, we reject the null
hypothesis and state that the income is impacted by these two variables. The
impact is positive, as their coefficients are positive.
Loan amount:
The p-value in this case is well below 5%, so we reject the null hypothesis
and state that there is a relationship between loan amount and income. As the
coefficient is negative, the relationship is also negative.
Subsidy received and instrument management cost:
As the p-values of these two variables are well above 5%, we cannot
reject the null hypothesis. There is no statistically significant evidence that
the income is affected by these two variables.
Recommendations:
The rearing cost should be increased, as it is positively related to the income
of the farmer. The insurance of crops should be decreased, as it is negatively
related to the income. The use of biofertilizers should be increased, and farmers
should be more mechanized in order to enhance the income. The loan amount is
negatively related to the income of the farmer, so it should also be decreased.
Difference in income of farmers by Districts:
Null hypothesis:
The mean income of farmers is the same in all the districts.
Alternate hypothesis:
The mean income of farmers is not the same in all the districts.
Methodology:
As the number of groups (districts) is more than 2 and the variable under
consideration is quantitative, we will use a one-way ANOVA test.
We ran the ANOVA test in Excel by arranging the incomes
according to their districts, which produced the following results.
As the test statistic is greater than the F critical value, we reject the
null hypothesis and state that there is a difference in income by district.
At least one district has a mean income that differs from the others.
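The F statistic behind the ANOVA table can be sketched as follows. The district names are taken from the data description, but the income figures are hypothetical (in thousands of rupees) and only three districts are shown. The function `one_way_anova_f` is an illustrative helper, not the Excel routine used for the report.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a dict of group name -> values."""
    values = [v for g in groups.values() for v in g]
    grand_mean = mean(values)
    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups.values())
    # Variation of the observations around their own group mean
    ss_within = sum((v - mean(g)) ** 2 for g in groups.values() for v in g)
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical incomes per acre, in thousands of rupees
incomes = {
    "Belagavi": [31, 32, 33],
    "Bellary": [32, 33, 34],
    "Mandya": [35, 36, 37],
}

f_stat, df_between, df_within = one_way_anova_f(incomes)
print(f"F = {f_stat:.2f} on ({df_between}, {df_within}) degrees of freedom")
```

The null hypothesis is rejected when F exceeds the critical value of the F distribution for these degrees of freedom at the chosen significance level.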
Average Income of Jayalaxmi Agro Tech (JAT)
Hypothesis
Null Hypothesis
Jayalaxmi Agro Tech (JAT) believes that the average income per acre from
sericulture is at least Rs. 35,000
Alternate Hypothesis
The average income per acre from sericulture is less than Rs. 35,000
Methodology
Conclusion
Do not reject H0
There is insufficient evidence to support the alternate hypothesis, so the data
are consistent with the assumption that the average income from sericulture is
at least ₹ 35000 per acre.
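The test decision above can be illustrated with a left-tailed z-test for a mean. This is a sketch under assumptions: it treats the population standard deviation as known (the next test is the one labeled σ unknown), and the sample mean and standard deviation used below are hypothetical placeholders, not the values from the survey. Only n = 508 and the hypothesised mean of 35,000 come from the report.

```python
from math import sqrt
from statistics import NormalDist

def left_tailed_z_test(sample_mean, mu0, sigma, n, alpha=0.05):
    """Test H0: mu >= mu0 against Ha: mu < mu0 with a known sigma."""
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    p_value = NormalDist().cdf(z)  # area to the left of z
    return z, p_value, p_value < alpha

# sample_mean and sigma are hypothetical; n and mu0 are from the report
z, p_value, reject = left_tailed_z_test(
    sample_mean=36000, mu0=35000, sigma=38000, n=508
)
print(f"z = {z:.3f}, p-value = {p_value:.3f}, reject H0: {reject}")
```

With a sample mean above 35,000, z is positive and the left-tail p-value is large, so H0 is not rejected.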
Average Income of Jayalaxmi Agro Tech (JAT)
(σ Unknown)
Hypothesis
Null Hypothesis
Jayalaxmi Agro Tech (JAT) believes that the average income per acre from
sericulture is at least Rs. 35,000
Alternate Hypothesis
The average income per acre from sericulture is less than Rs. 35,000
Methodology
Conclusion
Do not reject the H0 hypothesis
There is insufficient evidence to support the alternate hypothesis, so the data
are consistent with the assumption that the average income from sericulture is
at least ₹ 35000 per acre.
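When σ is unknown, the sample standard deviation replaces it and the statistic follows a t distribution with n - 1 degrees of freedom. The sketch below uses a small hypothetical sample of eight incomes rather than the survey data. The critical value is taken from a standard t table for 7 degrees of freedom and would be different for the full sample of 508.

```python
from math import sqrt
from statistics import mean, stdev

def one_sample_t(sample, mu0):
    """Return the t statistic and degrees of freedom for H0: mu = mu0."""
    n = len(sample)
    t_stat = (mean(sample) - mu0) / (stdev(sample) / sqrt(n))
    return t_stat, n - 1

# Hypothetical incomes per acre in rupees (not from the survey)
sample = [36000, 34000, 38000, 33000, 37000, 35000, 39000, 36000]

t_stat, df = one_sample_t(sample, mu0=35000)
t_critical = -1.895  # left-tail 5% critical value for df = 7, from a t table
reject = t_stat < t_critical
print(f"t = {t_stat:.3f}, df = {df}, reject H0: {reject}")
```

For the left-tailed test, H0 is rejected only when the t statistic falls below the negative critical value.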
Gender disparity among sericulturists in
Karnataka.
Hypothesis
The claim is that the proportion of female sericulturists is less than 15%.
Null Hypothesis
H0: p ≥ 0.15. In Karnataka, the proportion of female sericulturists is greater
than or equal to 15%
Alternate Hypothesis
Ha: p < 0.15. In Karnataka, the proportion of female sericulturists is less
than 15% (the claim)
Methodology
We ran a t-test in Excel and obtained the following result
Conclusion
Do not reject H0
As the claim is the alternate hypothesis, we conclude that there is insufficient
evidence to support the claim that the proportion of female sericulturists is
less than 15%.
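For a claim about a proportion, a one-sample z-test for a proportion is a common choice. The sketch below shows that test, which is not necessarily the procedure run in Excel for this report. The count of 70 female sericulturists is a hypothetical placeholder; only n = 508 and the 15% threshold come from the report.

```python
from math import sqrt
from statistics import NormalDist

def left_tailed_proportion_test(successes, n, p0, alpha=0.05):
    """Test H0: p >= p0 against Ha: p < p0 using the normal approximation."""
    p_hat = successes / n
    std_err = sqrt(p0 * (1 - p0) / n)  # standard error under H0
    z = (p_hat - p0) / std_err
    p_value = NormalDist().cdf(z)      # area to the left of z
    return p_hat, z, p_value, p_value < alpha

# 70 is a hypothetical count of female sericulturists; n is from the report
p_hat, z, p_value, reject = left_tailed_proportion_test(successes=70, n=508, p0=0.15)
print(f"p_hat = {p_hat:.4f}, z = {z:.3f}, p-value = {p_value:.3f}, reject H0: {reject}")
```

With these placeholder numbers the sample proportion is below 15%, but not by enough to reject H0 at the 5% level.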