Volume 9, Issue 9, September – 2024 International Journal of Innovative Science and Research Technology

ISSN No:-2456-2165 https://doi.org/10.38124/ijisrt/IJISRT24SEP1290

Forest Fire Prediction Using Random Forest Regressor: A Comprehensive Machine Learning Approach
S K Shivashankar¹; Prajwal M D²; Likith Raj K R³; Tanya Priyadarshini A R⁴; Manvitha S M⁵
¹Assistant Professor, Department of Computer Science and Engineering, P.E.S College of Engineering, Mandya, 571401, Karnataka, India
²Student, Department of Computer Science and Engineering, P.E.S College of Engineering, Mandya, 571401, Karnataka, India
³Student, Department of Computer Science and Engineering, P.E.S College of Engineering, Mandya, 571401, Karnataka, India
⁴Student, Department of Computer Science and Engineering, P.E.S College of Engineering, Mandya, 571401, Karnataka, India
⁵Student, Department of Computer Science and Engineering, P.E.S College of Engineering, Mandya, 571401, Karnataka, India

Abstract:- Forest fires are catastrophic events with profound environmental, economic, and social consequences. Their increasing frequency and intensity, driven by climate change, make early and accurate predictions essential for disaster management, mitigation, and response efforts. This study presents a comprehensive machine learning-based approach to predict forest fire confidence levels using the Random Forest Regressor. Leveraging satellite data from the MODIS instrument on NASA’s Terra satellite, our model incorporates various critical attributes such as brightness temperature, fire radiative power, and geographical coordinates. Extensive experimentation on data preprocessing, feature selection, and model optimization led to a highly accurate prediction model, achieving 94.5% accuracy. This paper provides a detailed examination of the methodology, including hyperparameter tuning and model evaluation. The findings emphasize the significant potential of integrating advanced machine learning algorithms with real-time satellite data to enhance fire management strategies, providing valuable insights for policymakers, environmentalists, and disaster management authorities. By offering timely predictions, our model can facilitate proactive forest fire prevention and reduce the severe impacts of wildfires on biodiversity, air quality, and human livelihoods.

Keywords:- Forest Fire Prediction, Machine Learning, Random Forest Regressor, MODIS Data, Predictive Analytics, Data Science, Disaster Management.

I. INTRODUCTION

Forest fires, or wildfires, are significant natural disasters that spread uncontrollably across forested areas, often leading to severe ecological, economic, and human loss. These fires, driven by various natural and human factors, devastate wildlife habitats, reduce air quality, and contribute to global climate change by releasing carbon dioxide and other greenhouse gases. In recent years, the frequency and intensity of forest fires have surged, exacerbated by rising global temperatures and changing weather patterns. This alarming trend underscores the urgent need for more accurate and timely predictions to mitigate the impacts of such fires.

Traditional forest fire prediction methods have predominantly relied on meteorological data, such as temperature, humidity, and wind speed, in conjunction with historical fire patterns. However, these methods often fall short when it comes to handling the complexity of modern-day environmental variables and the unpredictability of fires in different regions. They are typically less effective in providing detailed, high-confidence predictions, which are essential for informed decision-making in fire prevention and management.

With the advent of machine learning (ML) and the availability of remote sensing data, new opportunities have arisen to improve the accuracy of fire predictions. ML models, particularly ensemble methods like the Random Forest Regressor, offer significant improvements by analyzing complex datasets and identifying patterns that traditional models might overlook. These techniques can handle large datasets, such as those derived from satellite instruments, and can predict fire occurrences with high accuracy.

In this study, we aim to leverage the capabilities of the Random Forest Regressor to predict the confidence level of forest fire occurrences using satellite data from the MODIS instrument on the Terra satellite. By analyzing various factors indicative of fire events, including temperature, brightness, and fire radiative power, we propose a robust and scalable approach to forest fire prediction. Our research is intended to provide valuable insights into how machine learning can be harnessed to enhance disaster management strategies and improve the overall efficiency of forest fire prevention.


II. OBJECTIVES

• Develop a Robust Machine Learning Model
The primary objective of this study is to create a reliable and highly accurate machine learning model, specifically using the Random Forest Regressor, to predict the confidence level of forest fires. The model will leverage comprehensive satellite data to identify patterns and relationships that correlate with fire occurrences. By using a robust algorithm, we aim to build a prediction system that performs well across diverse datasets, ensuring consistent results regardless of regional variations or environmental conditions. This model will not only provide predictions but also indicate the likelihood (or confidence level) of fire occurrences, allowing decision-makers to focus on high-risk areas.

• Implement a Scalable Prediction System
To ensure practical application, we plan to integrate the developed model into a scalable web-based platform. This system will allow real-time predictions of forest fires based on live or recent data, enabling authorities to respond more quickly and effectively. The scalability aspect ensures that the system can handle large volumes of data from different regions, supporting both small-scale local implementations and large-scale national or global monitoring efforts. Additionally, the system will be designed to incorporate future updates, whether from new datasets or model improvements, without disrupting operations.

• Enhance Predictive Accuracy
Achieving high predictive accuracy is crucial for the model’s effectiveness in real-world scenarios. By focusing on data preprocessing, feature selection, and hyperparameter optimization, we aim to achieve an accuracy that surpasses traditional methods. This accuracy will be measured using standard performance metrics such as precision, recall, and mean squared error (MSE). High accuracy will not only improve the reliability of predictions but also increase the confidence of fire management teams in utilizing these predictions to make timely, critical decisions for resource allocation and evacuation efforts.

• Provide a Technological Framework
Finally, we aim to offer a comprehensive technological framework that can be used as a foundation for future research and development in forest fire prediction. This framework will encompass the entire process, from data collection and model development to deployment and integration with real-time systems. By documenting the steps involved in building and implementing the prediction model, future researchers and developers can build on this work, improving upon the methods and expanding the model’s application to different types of environmental data or other forms of disaster prediction.

III. LITERATURE REVIEW

Forest fires pose severe ecological, social, and economic challenges, making accurate prediction a key priority in disaster management. Over the years, a variety of techniques have been developed to predict forest fire occurrences, with each approach reflecting the technological capabilities of its time. Early methods were primarily based on statistical models that relied on historical fire data and meteorological variables such as temperature, humidity, and wind speed to assess fire risk. These models, while valuable, often lacked the ability to capture the complex interactions between multiple environmental factors, which limited their predictive accuracy.

With the advent of remote sensing technology, particularly satellite data from instruments like the Moderate Resolution Imaging Spectroradiometer (MODIS), forest fire prediction techniques began to evolve. MODIS, onboard NASA's Terra and Aqua satellites, provides near real-time observations of fire events across the globe. These satellite images offer rich datasets, including information on fire location, temperature anomalies, and fire radiative power (FRP). The integration of this remote sensing data into prediction models represented a significant leap forward, offering new insights into fire dynamics and providing a broader scope for early detection. However, effectively utilizing these large and complex datasets posed a new set of challenges, particularly with respect to real-time data processing and generalizing predictions across diverse regions.

The introduction of machine learning (ML) techniques has brought about further improvements in predictive capabilities. Traditional models, such as Decision Trees and Support Vector Machines (SVM), laid the groundwork for modern forest fire prediction by allowing for more sophisticated data analysis. However, these models often struggled with overfitting and lacked the flexibility to adapt to new data patterns, particularly in the case of non-linear and high-dimensional datasets. In response, ensemble learning techniques, such as Random Forest and Gradient Boosting Machines, have gained popularity for their ability to handle larger datasets and reduce overfitting, resulting in more reliable predictions.

Among these methods, the Random Forest algorithm has emerged as one of the most effective approaches for forest fire prediction. Random Forest is a robust ensemble learning method that builds multiple decision trees during training and combines their predictions to improve overall accuracy. This method excels at handling noisy and complex datasets, making it well-suited to the intricacies of environmental data. Additionally, Random Forest is less prone to overfitting compared to individual decision trees, thanks to its averaging process, which stabilizes predictions across multiple trees. Studies have demonstrated that Random Forest models outperform other machine learning algorithms in terms of
predictive accuracy, especially when dealing with large, multi-source datasets, such as satellite imagery, weather data, and historical fire records.

Incorporating remote sensing data from satellite platforms like MODIS into machine learning models has further enhanced forest fire prediction. Satellite data provides timely and extensive spatial coverage, capturing critical attributes such as brightness temperature, fire radiative power, and geographical coordinates. These attributes serve as key indicators of fire risk and help machine learning models to detect patterns and anomalies that may lead to fires. The challenge, however, lies in managing the high dimensionality of satellite data and ensuring that the model generalizes well across different geographic regions with varying environmental conditions.

Recent research has focused on improving the scalability and real-time application of machine learning models in forest fire prediction. One of the major limitations of earlier models was their inability to function effectively in dynamic environments, where conditions could change rapidly. New approaches aim to integrate real-time data streams from satellites, weather stations, and ground sensors into machine learning pipelines, enabling continuous model updates and more timely predictions. Furthermore, advancements in cloud computing and data infrastructure have facilitated the deployment of machine learning models at scale, making it possible to monitor large areas in real-time and provide fire predictions with actionable insights for disaster management teams.

Despite these advances, challenges remain. Data quality is a critical issue, as missing or inaccurate data can significantly affect model performance. Additionally, forest fire prediction models developed for one region often struggle to generalize to other regions due to variations in climate, vegetation, and fire behavior. Ongoing research seeks to address these limitations by exploring ways to enhance model generalizability through techniques such as transfer learning, which allows a model trained in one context to be adapted for use in another. Moreover, the integration of additional environmental factors, such as soil moisture, vegetation type, and wind direction, could further improve model accuracy.

IV. PROPOSED METHODOLOGY

The proposed methodology for predicting forest fires using a Random Forest Regressor consists of a multi-step process designed to ensure accurate predictions based on satellite data. This methodology includes data collection, preprocessing, model development, evaluation, and integration into a scalable system. Below, we outline each stage of the methodology:

Methodological Steps:

A. Data Collection and Preprocessing:
The foundation of any machine learning model lies in the quality and richness of the data it utilizes. In this study, we leverage satellite data from the MODIS instrument on NASA's Terra satellite. MODIS provides extensive information about global fire events, including geographical coordinates, temperature anomalies, and fire radiative power (FRP).

Key Attributes from the Dataset Include:
• Latitude and Longitude: Essential for identifying the geographical location of fire events.
• Brightness Temperature: An indicator of fire intensity, measured in Kelvin.
• Confidence Level: A score that represents the likelihood of a fire event, used as the target variable for prediction.
• Fire Radiative Power (FRP): Represents the intensity of the fire in terms of energy output.

Before feeding the data into the model, several preprocessing steps are necessary to ensure that the data is clean, consistent, and suitable for machine learning tasks. These steps include:
• Handling Missing Data: Missing values are either imputed or removed to maintain data quality.
• Normalization and Scaling: Continuous variables like brightness temperature and fire radiative power are normalized to ensure they have a uniform scale, which is important for optimizing model performance.
• Feature Encoding: Categorical variables, such as day/night indicators and fire types, are encoded using methods like one-hot encoding to convert them into numerical formats compatible with the Random Forest algorithm.

B. Model Development:
Once the data is preprocessed, the next step is to develop the Random Forest Regressor model. Random Forest is an ensemble learning method that builds multiple decision trees and averages their predictions to improve accuracy and reduce overfitting.

Steps Involved in Model Development:
• Splitting Data: The dataset is divided into training and testing sets, typically using a 70:30 ratio. This ensures that the model is trained on a portion of the data and evaluated on a separate portion to test its generalizability.
• Training the Model: The Random Forest Regressor is trained on the training set. During this process, multiple decision trees are created, each using a random subset of the features and data points.
• Hyperparameter Tuning: Key hyperparameters such as the number of trees (estimators), maximum tree depth, and minimum samples required to split a node are optimized using techniques like grid search or randomized search to ensure the model achieves the best possible performance.
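The preprocessing and splitting steps listed above can be sketched in a few lines of dependency-free Python. This is a minimal illustration under stated assumptions, not the pipeline used in the study: the records, the column names (`brightness`, `frp`, `daynight`, `confidence`), and the helper functions are all hypothetical, and mean imputation is only one of the options the text mentions.

```python
import random
from statistics import mean, pstdev

# Hypothetical MODIS-style records (illustrative values only); None marks a missing value.
rows = [
    {"brightness": 310.5, "frp": 12.0, "daynight": "D", "confidence": 45},
    {"brightness": 330.2, "frp": None, "daynight": "N", "confidence": 80},
    {"brightness": 305.1, "frp": 8.5,  "daynight": "D", "confidence": 30},
    {"brightness": 345.7, "frp": 60.3, "daynight": "N", "confidence": 95},
    {"brightness": None,  "frp": 20.1, "daynight": "D", "confidence": 55},
    {"brightness": 320.0, "frp": 15.4, "daynight": "N", "confidence": 62},
    {"brightness": 315.3, "frp": 11.2, "daynight": "D", "confidence": 50},
    {"brightness": 338.9, "frp": 42.7, "daynight": "D", "confidence": 88},
    {"brightness": 301.4, "frp": 6.9,  "daynight": "N", "confidence": 25},
    {"brightness": 327.6, "frp": 25.0, "daynight": "N", "confidence": 71},
]

def impute_mean(rows, col):
    """Replace missing values in `col` with the mean of the observed values."""
    fill = mean(r[col] for r in rows if r[col] is not None)
    for r in rows:
        if r[col] is None:
            r[col] = fill

def standardize(rows, col):
    """Rescale `col` to z-scores (zero mean, unit standard deviation)."""
    values = [r[col] for r in rows]
    mu, sigma = mean(values), pstdev(values)
    for r in rows:
        r[col] = (r[col] - mu) / sigma if sigma else 0.0

def one_hot(rows, col):
    """Replace categorical `col` with one 0/1 indicator column per category."""
    categories = sorted({r[col] for r in rows})
    for r in rows:
        value = r.pop(col)
        for c in categories:
            r[f"{col}_{c}"] = 1 if value == c else 0

def split_rows(rows, test_fraction=0.3, seed=42):
    """Shuffle a copy of the rows and split them into (train, test), 70:30 by default."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]

for col in ("brightness", "frp"):
    impute_mean(rows, col)
    standardize(rows, col)
one_hot(rows, "daynight")
train_rows, test_rows = split_rows(rows)
```

For brevity the sketch computes the imputation and scaling statistics on the full dataset. In practice these statistics are usually estimated on the training split only and then applied to the test split, so that no information from the test data leaks into training.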


C. System Integration:
The final step involves deploying the trained model into a Django-based web application to facilitate real-time predictions. This system allows users to input live or recent satellite data and receive predictions on forest fire confidence levels. Key components of the system integration include:

• User Interface
The web interface allows users to upload data and visualize the prediction results through charts and graphs.

• RESTful API
The model is exposed through APIs to enable seamless integration with external systems, allowing for real-time data input and predictions.

• Data Visualization
Visualization features are included to make the prediction results more intuitive and actionable, providing heatmaps or scatter plots of fire confidence levels based on geographical locations.

D. Model Evaluation:
The model’s performance is evaluated using a variety of metrics to assess both its accuracy and robustness. Metrics include:

• Mean Squared Error (MSE)
This measures the average squared difference between predicted and actual values.

• R-squared (R²)
This metric gives the proportion of variance in the target variable (fire confidence) that is explained by the model.

• Precision and Recall
These metrics evaluate the model's ability to correctly predict fire events (precision) and to detect actual fire events (recall). Because the target is a continuous score, computing them requires first thresholding the confidence values into fire and no-fire classes.

Cross-validation is also employed to check that the model’s performance is consistent across different subsets of the data, which helps detect overfitting.
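The evaluation metrics named in the Model Evaluation subsection can be written out directly. The sketch below is illustrative only: the sample values are made up, and the threshold of 80 used to turn confidence scores into fire / no-fire labels is an assumption of this example, since the paper does not state how precision and recall were derived from a regression output.

```python
def mse(y_true, y_pred):
    """Mean squared error: average of squared prediction errors."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def r_squared(y_true, y_pred):
    """R² = 1 - SS_res / SS_tot, the share of target variance explained."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def precision_recall(y_true, y_pred, threshold=80):
    """Binarise scores at `threshold` (an assumed cut-off), then count hits and misses."""
    actual = [t >= threshold for t in y_true]
    predicted = [p >= threshold for p in y_pred]
    tp = sum(a and p for a, p in zip(actual, predicted))        # true positives
    fp = sum((not a) and p for a, p in zip(actual, predicted))  # false alarms
    fn = sum(a and (not p) for a, p in zip(actual, predicted))  # missed fires
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Hypothetical confidence scores (0-100): observed vs. predicted.
y_true = [90, 40, 85, 20, 95, 70]
y_pred = [88, 45, 75, 25, 97, 82]

mse_value = mse(y_true, y_pred)
r2_value = r_squared(y_true, y_pred)
precision, recall = precision_recall(y_true, y_pred)
```

Note that precision and recall depend on the chosen threshold, whereas MSE and R² are computed on the raw scores. Equivalent MSE and R² functions are available in scikit-learn as `mean_squared_error` and `r2_score`.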
E. Flowchart of the Methodology

Fig 1: Flowchart


V. DATA DESCRIPTION

The dataset used in this study is derived from the MODIS instrument on the Terra satellite, providing comprehensive information on fire events. Key attributes include:
• Latitude: The geographical latitude of the fire event, measured in degrees. This provides information on how far north or south the fire event is located from the equator.
• Longitude: The geographical longitude of the fire event, measured in degrees. This specifies how far east or west the fire event occurred relative to the prime meridian.
• Brightness: This represents the brightness temperature (in Kelvin) of the fire detected by the satellite. Higher values indicate more intense fire events.
• Scan: The pixel size in the along-scan direction at the time of the observation. MODIS pixels grow larger toward the edge of the swath, so this value describes the actual ground footprint of the detection across the scan.
• Track: Similar to the scan, the pixel size in the along-track direction, i.e. the ground footprint of the detection in the direction of the satellite's orbit.
• Acq_Date: The acquisition date, indicating when the satellite captured this data. This allows temporal analysis of fire events over time.
• Acq_Time: The acquisition time of the observation, recorded in UTC (Coordinated Universal Time). This time helps determine whether the fire event was captured during the day or night.
• Satellite: This field specifies the satellite from which the data was obtained, either Terra or Aqua. Both satellites carry the MODIS instrument, but they pass over the same location at different times of the day.
• Instrument: The instrument used for capturing the fire event data. In this case, it's MODIS (Moderate Resolution Imaging Spectroradiometer).
• Confidence: A score (between 0 and 100) representing the satellite's confidence in detecting an actual fire event. A higher score indicates a higher likelihood that the detected event is a fire.
• Version: The version number of the dataset, which corresponds to the data processing method or software version used.
• Bright_t31: This is the brightness temperature of channel 31, which is typically used for measuring surface and cloud temperature. It is also measured in Kelvin and serves as an additional feature to help detect fire events.
• FRP (Fire Radiative Power): This represents the energy output from the fire, measured in megawatts (MW). Higher FRP values correspond to more intense fires and help in quantifying the strength of the fire.
• Daynight: This indicates whether the fire was detected during the day (D) or night (N). This feature can help in understanding how fire detection varies depending on the time of day.
• Type: A categorical variable representing the type of fire event. This can be further analyzed to distinguish between different fire occurrences or environmental conditions.

VI. ALGORITHM AND IMPLEMENTATION

A. Algorithm Selection:
The Random Forest Regressor (RFR) was chosen for this study due to its ability to model complex non-linear relationships and its robustness against overfitting. Random Forest is an ensemble learning method, combining multiple decision trees to improve predictive performance and reduce variance. Each tree is trained on a bootstrap sample of the data, and the final prediction is obtained by averaging the outputs of all trees.

Why Random Forest?
• Handling High-Dimensional Data: Random Forest can efficiently handle datasets with many variables and does not require feature selection beforehand, as informative features are favoured at each split during tree construction.
• Resistant to Overfitting: Since the Random Forest aggregates the results of many trees, it is far less prone to the overfitting that can occur with individual decision trees.
• Interpretability: Feature importance scores generated by Random Forest offer insights into the significance of different variables for the prediction task.

B. Data Preparation
The first step involved extensive preprocessing of the dataset obtained from the MODIS instrument. Data preprocessing included the following stages:
• Handling Missing Values: Missing values in satellite data, which are common due to atmospheric interference, were addressed using imputation techniques such as mean substitution or k-Nearest Neighbors (kNN) imputation.
• Feature Scaling: Continuous features, such as brightness temperature and fire radiative power, were rescaled so that the algorithm treated them uniformly. Standardization (z-scores) was applied to bring all features onto a similar scale.
• One-Hot Encoding: Categorical variables like 'Day/Night Indicator' and 'Fire Type' were encoded using one-hot encoding to convert them into a numerical format compatible with the Random Forest algorithm.

C. Feature Engineering
Feature engineering plays a critical role in improving the model’s performance. In addition to the raw data collected, new features were created based on domain knowledge:
• Temporal Features: Day of the year, time of day, and seasonal information were extracted from the timestamp data to capture temporal patterns in fire occurrences.
• Geospatial Features: Proximity to previous fire locations and vegetation density maps were incorporated as additional spatial features, improving the model’s ability to understand geographical fire trends.

D. Hyperparameter Tuning
Random Forest requires careful tuning of hyperparameters to achieve optimal performance. Key hyperparameters tuned include:
• Number of Trees (n_estimators): After experimenting with different values, an ensemble of 500 trees was found to provide a balance between computational efficiency and model accuracy.
• Maximum Depth of Trees: Limiting the maximum depth of trees to 20 helped prevent overfitting, ensuring that each tree focused on high-level patterns rather than noise in the data.
• Minimum Samples per Leaf (min_samples_leaf): Setting this parameter to 5 ensured that trees did not become overly complex by splitting too deeply on small subsets of data.
• Bootstrap Sampling: Enabling bootstrapping let each tree train on its own random sample of the data, drawn with replacement, which decorrelates the trees and further reduces overfitting.

E. Model Training
The dataset was split into a training set (80%) and a test set (20%) to evaluate model performance on unseen data. Cross-validation techniques, such as 5-fold cross-validation, were used to ensure robustness and to detect overfitting.
• Cross-Validation: This method ensured that the model’s accuracy was not just high on the training data but also generalized well to new data.
• Performance Metrics: The primary performance metric used was Mean Squared Error (MSE), supplemented by R-squared (R²) to evaluate the goodness of fit.

F. Ensemble Techniques:
To further enhance model accuracy, ensemble techniques were explored. In addition to Random Forest, models such as Gradient Boosting and XGBoost were tested, and a voting regressor was implemented. The final prediction was based on an ensemble of these models, which showed marginal improvement over using Random Forest alone.

G. Model Deployment
The trained model was deployed into a web-based application for real-time forest fire prediction. Key steps for deployment included:
• Django Integration: The Random Forest model was integrated into a Django-based web framework, where users can input real-time satellite data. Upon submission, the model generates predictions regarding fire occurrence and confidence levels.
• RESTful API: A REST API was developed to facilitate real-time data ingestion from external systems. The API enables remote applications to feed new satellite data into the model, allowing for continuous updates and predictions.
• User Interface: The web interface was designed to be user-friendly, offering data visualization in the form of heatmaps, charts, and fire intensity graphs based on model outputs.

H. Model Monitoring and Updating
Once deployed, the model was continuously monitored for performance using a live feedback loop. New data was periodically fed into the model, which was then retrained at regular intervals to ensure accuracy.
• Real-Time Data Pipeline: A real-time data pipeline was established to ingest satellite data, preprocess it, and feed it into the model for continuous learning.
• Model Retraining: Automated retraining was implemented to periodically update the model with the latest data, ensuring that it adapts to evolving fire patterns.

I. Technological Stack:

Programming Languages and Frameworks:
• Python was chosen for developing the machine learning model due to its rich ecosystem of libraries for data analysis and machine learning. Python’s flexibility and ease of use made it the ideal choice for both rapid prototyping and scaling the solution.
• Django is a high-level Python web framework used for developing the web application. It allows for rapid development, ensures security, and provides a robust platform to deploy the predictive model.

Libraries:
• Scikit-learn was used for implementing the Random Forest Regressor model. It is a powerful machine learning library in Python that provides easy-to-use tools for data mining and analysis.
• Pandas and NumPy were used for data manipulation, including data cleaning, feature engineering, and preprocessing. These libraries enable efficient handling of large datasets.
• Matplotlib and Seaborn were utilized for data visualization, enabling the creation of clear, insightful charts and graphs to interpret the data and model results.

Database:
• PostgreSQL was selected as the database system for storing the user inputs and prediction results. It offers robustness, scalability, and support for complex queries, making it ideal for handling large datasets and ensuring data integrity.


Cloud Services:
• AWS (Amazon Web Services) or Google Cloud was used for hosting the web application and machine learning model. These cloud platforms provide scalable computing resources, making it easier to handle real-time data processing and deployment.

VII. RESULTS AND DISCUSSION

Our Random Forest Regressor model demonstrated strong performance in predicting the confidence level of forest fire occurrences, achieving high accuracy and strong scores on the other key metrics. Below, we discuss the performance of the model, its significance, and areas for improvement.

A. Model Performance

• Accuracy
The model achieved an accuracy of 94.5%, indicating that it correctly predicted the confidence level of forest fire occurrences for most instances in the dataset. This shows the Random Forest model's ability to capture key patterns and relationships within the data, making it highly reliable for this application.

• Precision
Precision measures the ratio of true positive predictions to the total number of predicted positives. In this case, the precision score was 0.92, indicating that 92% of the fire events predicted with high confidence were actual fires. This highlights the model’s effectiveness in reducing false alarms and providing actionable predictions for fire management teams.

• Recall (Sensitivity)
Recall measures the ability of the model to correctly identify all actual fire events. The model achieved a recall score of 0.89, meaning it successfully detected 89% of all real forest fire events. This makes the model effective at capturing most fire incidents, although there is room for improvement to ensure that even more real fire events are predicted.

• F1 Score
The F1 score, which balances precision and recall, was 0.90. This score indicates that the model provides a good balance between avoiding false positives (incorrectly predicting fires) and false negatives (missing actual fires). The strong F1 score confirms the overall reliability of the model.

• R-squared (R²)
The R² value, which measures the proportion of variance in the target variable explained by the model, was 0.91. This indicates that 91% of the variation in fire confidence levels was explained by the model, demonstrating strong explanatory power on the test data.

B. Discussion
The Random Forest Regressor outperformed traditional models, such as Decision Trees and Linear Regression, due to its ability to handle high-dimensional data and complex non-linear interactions. The ensemble method effectively reduced overfitting, resulting in high generalization performance on the test data.

Strengths:
• Robust Performance: The high precision, recall, and F1 scores demonstrate the model’s robustness, making it highly suitable for practical applications in forest fire management.
• Feature Importance: The model’s ability to rank feature importance provided valuable insights into which factors (e.g., brightness temperature, fire radiative power) were most influential in predicting forest fire confidence levels.
• Scalability: The integration of the model into a web-based application allows for real-time fire predictions, facilitating early intervention by authorities and improving resource allocation.

Challenges:
• Regional Variability: While the model performed well on the test dataset, its accuracy may vary when applied to different geographic regions with varying ecological and climatic conditions. Additional data from diverse regions should be integrated to improve generalizability.
• Data Quality: The model's performance is sensitive to the quality of satellite data. Missing or noisy data, especially from remote sensing, can impact prediction accuracy. Future iterations should incorporate data quality assurance mechanisms and noise reduction techniques.

Future Work:
• Additional Features: Incorporating real-time weather conditions, such as wind speed and humidity, as well as vegetation indices, could further enhance the model’s accuracy and predictive power.
• Real-Time Updates: Developing a real-time data pipeline
diverse fire prediction scenarios. that continuously updates the model with new satellite data
will ensure that the predictions stay relevant and adapt to
 Mean Squared Error (MSE): changing fire patterns.
The MSE for the model was 0.024, suggesting that the
 Integration with Alert Systems: The deployment of the
predicted confidence levels were close to the actual model into real-time forest fire alert systems could
confidence levels, with a minimal average error. This low improve early detection, ensuring faster response times and
error highlights the Random Forest’s capability to produce
reducing the damage caused by wildfires.
precise predictions.
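To make the metrics in Section VII-A concrete, the following is a minimal sketch of how the regression metrics (MSE, R²) and the classification-style metrics (accuracy, precision, recall, F1) could be computed from the same set of predicted confidence levels. It is not the authors' evaluation code. The toy values, the 0 to 1 confidence scale, the 0.5 cut-off for "high confidence", and all function and variable names are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Sequence


def mse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean squared error: average squared gap between actual and predicted."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """R²: share of the variance in y_true that the predictions explain."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


@dataclass
class ThresholdedScores:
    accuracy: float
    precision: float
    recall: float
    f1: float


def thresholded_scores(
    y_true: Sequence[float], y_pred: Sequence[float], threshold: float
) -> ThresholdedScores:
    """Treat confidence >= threshold as the positive ("fire") class and score it.

    The threshold is a hypothetical choice; the paper does not state one.
    """
    tp = fp = fn = tn = 0
    for t, p in zip(y_true, y_pred):
        actual, predicted = t >= threshold, p >= threshold
        if actual and predicted:
            tp += 1
        elif predicted:
            fp += 1
        elif actual:
            fn += 1
        else:
            tn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return ThresholdedScores(accuracy, precision, recall, f1)


# Toy data (assumed, not from the paper): confidence levels scaled to [0, 1].
actual_confidence = [0.9, 0.8, 0.3, 0.7, 0.2, 0.6]
predicted_confidence = [0.85, 0.75, 0.55, 0.4, 0.25, 0.65]

mse_value = mse(actual_confidence, predicted_confidence)
r2_value = r_squared(actual_confidence, predicted_confidence)
scores = thresholded_scores(actual_confidence, predicted_confidence, threshold=0.5)

print(f"MSE={mse_value:.4f}  R2={r2_value:.4f}")
print(scores)
```

In practice the same quantities are available from scikit-learn's `sklearn.metrics` module (`mean_squared_error`, `r2_score`, `precision_score`, `recall_score`, `f1_score`). The plain-Python version is used here only to show the arithmetic behind each reported number.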


VIII. CONCLUSION

This research successfully demonstrates the application of machine learning, specifically the Random Forest Regressor, in predicting forest fire occurrences with high accuracy. Achieving a 94.5% prediction accuracy, the model underscores the utility of integrating satellite-derived data and advanced analytics to address the growing challenge of forest fire management. By identifying high-risk areas through confidence levels, our model can be instrumental in enabling faster, more targeted disaster response efforts. The development of a scalable web-based platform further enhances its practical utility, allowing for real-time predictions and seamless integration with existing fire management systems. Moreover, this study provides a foundational framework for future research and development, emphasizing the importance of expanding datasets, incorporating additional environmental factors, and exploring other machine learning techniques to improve prediction accuracy. As climate change continues to exacerbate the conditions that lead to wildfires, the deployment of such predictive models becomes increasingly critical in reducing the ecological, economic, and human toll of forest fires. Going forward, refining this approach with real-time data streams and enhancing its scalability across diverse geographic regions will contribute significantly to more effective forest fire prevention and mitigation strategies globally.