1.1 What Is Data Mining?
Generally, data mining (sometimes called data or knowledge discovery) is the process of analyzing
data from different perspectives and summarizing it into useful information: information that can
be used to increase revenue, cut costs, or both. Data mining software is one of a number of
analytical tools for analyzing data. It allows users to analyze data from many different dimensions
or angles, categorize it, and summarize the relationships identified. Technically, data mining is the
process of finding correlations or patterns among dozens of fields in large relational databases.
While large-scale information technology has been evolving separate transaction and analytical
systems, data mining provides the link between the two. Data mining software analyzes
relationships and patterns in stored transaction data based on open-ended user queries. Several
types of analytical software are available: statistical, machine learning, and neural networks.
Classes: Stored data is used to locate data in predetermined groups. For example, a
restaurant chain could mine customer purchase data to determine when customers visit and
what they typically order. This information could be used to increase traffic by having daily
Sequential patterns: Data is mined to anticipate behavior patterns and trends. For
example, an outdoor equipment retailer could predict the likelihood of a backpack being
purchased based on a consumer's purchase of sleeping bags and hiking shoes.
1) Extract, transform, and load transaction data onto the data warehouse system.
2) Store and manage the data in a multidimensional database system.
3) Provide data access to business analysts and information technology professionals.
4) Analyze the data by application software.
5) Present the data in a useful format, such as a graph or table.
Artificial neural networks: Non-linear predictive models that learn through training and
resemble biological neural networks in structure.
Decision trees: Tree-shaped structures that represent sets of decisions. These decisions
generate rules for the classification of a dataset. Specific decision tree methods include
Classification and Regression Trees (CART) and Chi Square Automatic Interaction
Detection (CHAID). CART and CHAID are decision tree techniques used for classification
of a dataset. They provide a set of rules that you can apply to a new (unclassified) dataset
to predict which records will have a given outcome. CART segments a dataset by creating
2-way splits while CHAID segments using chi square tests to create multi-way splits.
CART typically requires less data preparation than CHAID.
Nearest neighbor method: A technique that classifies each record in a dataset based on a
combination of the classes of the k record(s) most similar to it in a historical dataset
(where k ≥ 1). Sometimes called the k-nearest neighbor technique.
Rule induction: The extraction of useful if-then rules from data based on statistical
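The nearest neighbor method described above can be illustrated with a minimal Python sketch. It assumes numeric feature vectors, Euclidean distance and a simple majority vote; the function name knn_classify and the sample records are hypothetical and are not taken from the project.

```python
import math
from collections import Counter


def knn_classify(history, query, k=3):
    """Classify `query` by majority vote among its k nearest historical records.

    `history` is a list of (feature_vector, label) pairs (illustrative format).
    """
    # Rank historical records by Euclidean distance to the query record.
    ranked = sorted(history, key=lambda rec: math.dist(rec[0], query))
    # Count the class labels of the k closest records and return the most common.
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical historical dataset: two features per record, one class label.
history = [
    ((1.0, 1.0), "buyer"),
    ((1.2, 0.8), "buyer"),
    ((0.9, 1.1), "buyer"),
    ((5.0, 5.0), "non-buyer"),
    ((5.5, 4.5), "non-buyer"),
]

print(knn_classify(history, (1.1, 1.0), k=3))
print(knn_classify(history, (5.2, 4.9), k=1))
```

With k = 1 the record simply takes the class of its single closest neighbor; larger values of k smooth out the effect of noisy records.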
Large quantities of data: The volume of data is so great that it has to be analyzed by automated
techniques, e.g. satellite information, credit card transactions, etc.
Noisy, incomplete data: Imprecision is a characteristic of all data collection.
Complex data structure: Conventional statistical analysis is not possible.
Heterogeneous data stored in legacy systems.
1) Data mining is one of the most effective services available today. With its help, one can
discover valuable information about customers and their behavior toward a specific set of
products, and evaluate, analyze, store, mine and load the related data.
2) An analytical CRM model and strategic business decisions can be built with the help of data
mining, as it provides a complete synopsis of customers.
3) Many organizations have run data mining projects, which have helped them make marked
improvements in their marketing strategies (campaigns).
4) Data mining is generally used by organizations with a strong customer focus. Because it is
flexible in where it can be applied, it is widely used in applications that forecast crucial
data, including industry analysis and consumer buying behavior.
5) Fast access to data, together with economical processing techniques, has made data mining
one of the most suitable services a company can seek.
1. Marketing / Retail
Data mining helps marketing companies build models based on historical data to predict who
will respond to new marketing campaigns such as direct mail and online marketing campaigns.
With these results, marketers can take an appropriate approach to selling profitable products to
targeted customers.
Data mining brings similar benefits to retail companies. Through
market basket analysis, a store can arrange its products so that
customers can conveniently buy frequently purchased products together. In addition, it helps
retail companies offer discounts on particular products that will attract more customers.
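As a rough illustration of market basket analysis, the following Python sketch counts how often pairs of products appear in the same basket (support) and how often one product is bought given another (confidence). The basket contents and the function names pair_support and confidence are hypothetical, chosen only for this example.

```python
from collections import Counter
from itertools import combinations


def pair_support(baskets):
    """Return the fraction of baskets that contain each pair of items."""
    counts = Counter()
    for basket in baskets:
        # Sort and de-duplicate so each pair is counted once per basket.
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    total = len(baskets)
    return {pair: count / total for pair, count in counts.items()}


def confidence(baskets, antecedent, consequent):
    """Estimate P(consequent in basket | antecedent in basket)."""
    with_antecedent = [b for b in baskets if antecedent in b]
    if not with_antecedent:
        return 0.0
    hits = sum(1 for b in with_antecedent if consequent in b)
    return hits / len(with_antecedent)


# Hypothetical purchase data: one list of products per transaction.
baskets = [
    ["bread", "butter", "milk"],
    ["bread", "butter"],
    ["bread", "jam"],
    ["milk", "butter"],
]

support = pair_support(baskets)
print(support[("bread", "butter")])
print(confidence(baskets, "bread", "butter"))
```

Pairs with high support and confidence are candidates for being placed together on the shelf or bundled in a discount.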
2. Finance / Banking
Data mining gives financial institutions information about loans and credit
reporting. By building a model from historical customer data, banks and financial institutions
can distinguish good loans from bad ones. In addition, data mining helps banks detect fraudulent credit
card transactions to protect credit card owners.
3. Manufacturing
By applying data mining to operational engineering data, manufacturers can detect faulty
equipment and determine optimal control parameters. For example, semiconductor manufacturers
face the challenge that even when the conditions of manufacturing environments at different wafer
production plants are similar, the quality of the wafers is not the same, and some wafers
even have defects for unknown reasons. Data mining has been applied to determine the ranges of control parameters
that lead to the production of golden wafers. Those optimal control parameters are then used to
manufacture wafers of the desired quality.
4. Governments
Data mining helps government agencies by digging through and analyzing records of financial
transactions to build patterns that can detect money laundering or criminal activities.
5. Law enforcement
Data mining can aid law enforcers in identifying criminal suspects as well as apprehending
these criminals by examining trends in location, crime type, habit, and other patterns of behavior.
6. Researchers
Data mining can assist researchers by speeding up their data analysis, thus allowing
them more time to work on other projects.
Location-based social networks (LBSNs) offer researchers rich data to study people's online
activities and mobility patterns. One important application of such studies is to provide
personalized point-of-interest (POI) recommendations to enhance user experience in LBSNs.
Previous solutions directly predict users' preference on locations but fail to provide insights about
users' preference transitions among locations. In this work, we propose a novel category-aware
POI recommendation model, which exploits the transition patterns of users' preference over
location categories to improve location recommendation accuracy. Our approach consists of two
stages: (1) preference transition (over location categories) prediction, and (2) category-aware POI
recommendation. Matrix factorization is employed to predict a user's preference transitions over
categories and then her preference on locations in the corresponding categories. Real data based
experiments demonstrate that our approach outperforms the state-of-the-art POI recommendation
models by at least 39.75% in terms of recall.
With the rapid development of location-based social networks (LBSNs), spatial item
recommendation has become an important way of helping users discover interesting locations to
increase their engagement with location-based services. Although human movement exhibits
sequential patterns in LBSNs, most current studies on spatial item recommendations do not
consider the sequential influence of locations. Leveraging sequential patterns in spatial item
recommendation is, however, very challenging, considering 1) users' check-in data in LBSNs has
a low sampling rate in both space and time, which renders existing prediction techniques on GPS
trajectories ineffective; 2) the prediction space is extremely large, with millions of distinct
locations as the next prediction target, which impedes the application of classical Markov chain
models; and 3) there is no existing framework that unifies users' personal interests and the
sequential influence in a principled manner. In light of the above challenges, we propose a
sequential personalized spatial item recommendation framework (SPORE) which introduces a
novel latent variable topic-region to model and fuse sequential influence with personal interests in
the latent and exponential space. The advantages of modeling the sequential effect at the topic-
region level include a significantly reduced prediction space, an effective alleviation of data
sparsity and a direct expression of the semantic meaning of users' spatial activities. Furthermore,
we design an asymmetric Locality Sensitive Hashing (ALSH) technique to speed up the online
top-k recommendation process by extending the traditional LSH. We evaluate the performance of
SPORE on two real datasets and one large-scale synthetic dataset. The results demonstrate a
significant improvement in SPORE's ability to recommend spatial items, in terms of both
effectiveness and efficiency, compared with the state-of-the-art methods.
With the growing popularity of location-based social networks, numerous location visiting records
(e.g., check-ins) continue to accumulate over time. The more these records are collected, the better
we can understand users’ mobility patterns and the more accurately we can predict their future
locations. However, due to the personality trait of neophilia, people also show a propensity for
novelty seeking in human mobility, that is, exploring unvisited but tailored locations for them to
visit. As such, the existing prediction algorithms, mainly relying on regular mobility patterns, face
severe challenges because such behavior is beyond the reach of regularity. As a matter of fact, the
prediction of this behavior not only relies on the forecast of novelty-seeking tendency but also
depends on how to determine unvisited candidate locations. To this end, we put forward a
Collaborative Exploration and Periodically Returning model (CEPR), based on a novel problem,
Exploration Prediction (EP), which forecasts whether people will seek unvisited locations to visit,
in the following. When people are predicted to do exploration, a state-of-the-art recommendation
algorithm, armed with collaborative social knowledge and assisted by geographical influence, will
be applied for seeking the suitable candidates; otherwise, a traditional prediction algorithm,
incorporating both regularity and the Markov model, will be put into use for figuring out the most
possible locations to visit. We then perform case studies on check-ins and evaluate them on two
large-scale check-in datasets with 6M and 36M records, respectively. The evaluation results show
that EP achieves a roughly 20% classification error rate on both datasets, greatly outperforming
the baselines, and that CEPR improves performances by as much as 30% compared to the
traditional location prediction algorithms.
Even though human movement and mobility patterns have a high degree of freedom and variation,
they also exhibit structural patterns due to geographic and social constraints. Using cell phone
location data, as well as data from two online location-based social networks, we aim to understand
what basic laws govern human motion and dynamics. We find that humans experience a
combination of periodic movement that is geographically limited and seemingly random jumps
correlated with their social networks. Short-ranged travel is periodic both spatially and temporally
and not affected by the social network structure, while long-distance travel is more influenced by
social network ties. We show that social relationships can explain about 10% to 30% of all human
movement, while periodic behavior explains 50% to 70%. Based on our findings, we develop a
model of human mobility that combines periodic short range movements with travel due to the
social network structure. We show that our model reliably predicts the locations and dynamics of
future human movement and gives an order of magnitude better performance than present models
of human mobility.
In this paper, we propose a new feature fusion approach, i.e., GlobAL feature fusion for
LOcation Prediction (GALLOP), to cope with the variety problem in location prediction.
To improve the applicability of the location prediction approach, we utilize several kinds of
features and discuss their different characteristics in the variety of check-in scenarios.
Three classes of features are used in GALLOP: context feature (geographical aspects),
collaboration feature (users’ latent interest space) and content feature (places’ description
attributes).
We introduce intuitive ways to model these check-in features and then formalize a
combination framework to deliver the predicted target places to end users.
First, we investigate the difference of several representative check-in scenarios from the
spatial and temporal aspects. We argue that these varying check-in scenarios ask for more
general location prediction methods.
Second, we demonstrate the combination power of location features in a novel angle. We
not only utilize the different classes of context, collaboration and content information, but
also factorize them in a new way to improve the prediction robustness and generality.
The proposed GALLOP prediction approach is not only general over different check-in
scenarios but also comprehensive of different features. In the context feature, we design a
multiple granularity model to profile the geographical closeness.
We select the predicted candidates based on the combination of local district, local city and
state scales. The weights of each scale are learned from training data.
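The combination of district, city and state scales can be sketched in Python as a weighted sum of per-scale closeness values. This is only an illustration under stated assumptions: closeness at a scale is taken here to be the share of the user's past check-ins that fall in the same region as the candidate, and the weights are fixed placeholder values rather than weights learned from training data. The names scale_closeness and context_score are hypothetical.

```python
SCALES = ("district", "city", "state")


def scale_closeness(history, candidate, scale):
    """Share of past check-ins located in the candidate's region at one scale."""
    if not history:
        return 0.0
    same_region = sum(1 for visit in history if visit[scale] == candidate[scale])
    return same_region / len(history)


def context_score(history, candidate, weights):
    """Weighted combination of closeness over the district, city and state scales."""
    return sum(weights[s] * scale_closeness(history, candidate, s) for s in SCALES)


# Hypothetical check-in history of one user.
history = [
    {"district": "D1", "city": "C1", "state": "S1"},
    {"district": "D1", "city": "C1", "state": "S1"},
    {"district": "D2", "city": "C1", "state": "S1"},
    {"district": "D9", "city": "C7", "state": "S2"},
]

# Placeholder weights; in the described approach they would be learned from training data.
weights = {"district": 0.5, "city": 0.3, "state": 0.2}

near = {"district": "D1", "city": "C1", "state": "S1"}
far = {"district": "D5", "city": "C4", "state": "S2"}

print(context_score(history, near, weights))
print(context_score(history, far, weights))
```

Candidates can then be ranked by this score, so that places in the user's usual district outrank places that only share the same state.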
Finally, an extensive study over several real datasets reveals the improvement and
advantage of our approach. We provide an empirical study with several competition
Detailed experiments show the different behaviors of the prediction methods, and prove
that the general location prediction approach is a better choice to tackle the location
prediction challenges.
The feasibility of the project is analyzed in this phase, and a business proposal is put
forth with a very general plan for the project and some cost estimates. During system analysis,
the feasibility study of the proposed system is carried out. This is to ensure that the
proposed system is not a burden to the company. For feasibility analysis, some
understanding of the major requirements for the system is essential.
This study is carried out to check the economic impact that the system will have on the
organization. The amount of funds that the company can put into the research and development of
the system is limited, so the expenditures must be justified. The developed system is well within
the budget, which was achieved because most of the technologies used are freely available. Only
the customized products had to be purchased.
This study is carried out to check the technical feasibility, that is, the technical
requirements of the system. Any system developed must not place a high demand on the available
technical resources, since this would lead to high demands being placed on the client. The
developed system must have modest
requirements, as only minimal or no changes are required for implementing this system.
This aspect of the study checks the level of acceptance of the system by the user. This
includes the process of training the user to use the system efficiently. The user must not feel
threatened by the system, but must instead accept it as a necessity. The level of acceptance by users
depends solely on the methods that are employed to educate users about the system and to make
them familiar with it. Their level of confidence must be raised so that they are also able to offer
constructive criticism, which is welcomed, as they are the final users of the system.
The Primary goals in the design of the UML are as follows:
1. Provide users a ready-to-use, expressive visual modeling Language so that they can
develop and exchange meaningful models.
2. Provide extendibility and specialization mechanisms to extend the core concepts.
3. Be independent of particular programming languages and development processes.
4. Provide a formal basis for understanding the modeling language.
5. Encourage the growth of the OO tools market.
6. Support higher level development concepts such as collaborations, frameworks, patterns
and components.
7. Integrate best practices.
[Use case diagram: Register, Login, Update Status, User’s Mobility Logs, Find Friends, Send Request, View Friend Requests & Friends, User’s Check In]
In software engineering, a class diagram in the Unified Modeling Language (UML) is a type of
static structure diagram that describes the structure of a system by showing the system's classes,
their attributes, operations (or methods), and the relationships among the classes. It explains which
class contains information.
[Class diagram: Register, Login]
A sequence diagram in Unified Modeling Language (UML) is a kind of interaction diagram that
shows how processes operate with one another and in what order. It is a construct of a Message
Sequence Chart. Sequence diagrams are sometimes called event diagrams, event scenarios, and
timing diagrams.
[Sequence diagram: View User Details, View Profile, Predict User’s Next Location, User’s Check In, Find Friends, Response to Request]
Activity diagrams are graphical representations of workflows of stepwise activities and actions
with support for choice, iteration and concurrency. In the Unified Modeling Language, activity
diagrams can be used to describe the business and operational step-by-step workflows of
components in a system. An activity diagram shows the overall flow of control.
[Activity diagram: Register, Login, Update Status, User’s Mobility, Find Friends, Send Request, Predict User’s Next Location, View Friend, User’s Check In]
The input design is the link between the information system and the user. It comprises
developing specifications and procedures for data preparation, that is, the steps necessary to put
transaction data into a usable form for processing. This can be achieved by instructing the
computer to read data from a written or printed document, or by having people key the data
directly into the system. The design of input focuses on controlling the amount of input required,
controlling errors, avoiding delay, avoiding extra steps and keeping the process simple. The
input is designed so that it provides security and ease of use while retaining
privacy. Input design considered the following things:
1. Input design is the process of converting a user-oriented description of the input into a
computer-based system. This design is important to avoid errors in the data input process and to show
management the correct direction for getting correct information from the computerized system.
2. It is achieved by creating user-friendly screens for data entry that can handle large volumes of data.
The goal of designing input is to make data entry easier and free from errors. The data entry
screen is designed so that all data manipulations can be performed. It also provides
record viewing facilities.
3. When data is entered, it is checked for validity. Data can be entered with the help of
screens. Appropriate messages are provided as and when needed so that the user is not left in a
maze at any instant. Thus the objective of input design is to create an input layout that is easy to follow.
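The validity check on entered data can be illustrated with a short Python sketch. It is not the project's own code: the record fields (user, place, latitude, longitude) and the function name validate_checkin are assumptions made for this example, and the actual screens may validate different fields.

```python
def validate_checkin(record):
    """Return a list of error messages; an empty list means the record is valid."""
    errors = []

    # Required text fields must be present and non-blank.
    for field in ("user", "place"):
        if not str(record.get(field) or "").strip():
            errors.append(f"{field} is required")

    # Coordinates must be numeric and inside the valid geographic range.
    for field, low, high in (("latitude", -90.0, 90.0), ("longitude", -180.0, 180.0)):
        try:
            value = float(record.get(field))
        except (TypeError, ValueError):
            errors.append(f"{field} must be a number")
            continue
        if not low <= value <= high:
            errors.append(f"{field} must be between {low} and {high}")

    return errors


# Hypothetical form submissions.
good = {"user": "alice", "place": "cafe", "latitude": "12.97", "longitude": 77.59}
bad = {"user": " ", "place": "cafe", "latitude": "abc", "longitude": 200}

print(validate_checkin(good))
print(validate_checkin(bad))
```

Returning every message at once, rather than stopping at the first error, lets the entry screen show the user all the fields that need correcting in a single pass.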
A quality output is one that meets the requirements of the end user and presents the information
clearly. In any system, the results of processing are communicated to users and to other systems
through outputs. Output design determines how the information is to be displayed for
immediate need, as well as the hard copy output. Output is the most important and direct source
of information for the user. Efficient and intelligent output design improves the system’s relationship
with the user and helps decision-making.
1. Designing computer output should proceed in an organized, well thought out manner; the right
output must be developed while ensuring that each output element is designed so that people will
find the system easy to use effectively. When analysts design computer output, they should
identify the specific output that is needed to meet the requirements.
3. Create document, report, or other formats that contain information produced by the system.
The output form of an information system should accomplish one or more of the following
Software Environment
Microsoft .NET is a set of Microsoft software technologies for rapidly building and
integrating XML Web services, Microsoft Windows-based applications, and Web solutions. The
.NET Framework is a language-neutral platform for writing programs that can easily and securely
interoperate. There’s no language barrier with .NET: there are numerous languages available to
the developer, including Managed C++, C#, Visual Basic and JScript. The .NET framework
provides the foundation for components to interact seamlessly, whether locally or remotely on
different platforms. It standardizes common data types and communications protocols so that
components created in different languages can easily interoperate.
“.NET” is also the collective name given to various software components built
upon the .NET platform. These will be both products (Visual Studio.NET and Windows.NET Server,
for instance) and services (like Passport, .NET My Services, and so on).
The CLR is described as the “execution engine” of .NET. It provides the environment within which
programs run. The most important features are:
Managed Code
Managed code is code that targets .NET and contains certain extra
information, “metadata”, to describe itself. Whilst both managed and unmanaged code can run
in the runtime, only managed code contains the information that allows the CLR to guarantee,
for instance, safe execution and interoperability.
Managed Data
With Managed Code comes Managed Data. The CLR provides memory allocation and
deallocation facilities, and garbage collection. Some .NET languages use Managed Data by
default, such as C#, Visual Basic.NET and JScript.NET, whereas others, namely C++, do not.
Targeting CLR can, depending on the language you’re using, impose certain constraints on the
features available. As with managed and unmanaged code, one can have both managed and
unmanaged data in .NET applications - data that doesn’t get garbage collected but instead is
looked after by unmanaged code.
The CLR uses something called the Common Type System (CTS) to strictly enforce type
safety. This ensures that all classes are compatible with each other, by describing types in a
common way. The CTS defines how types work within the runtime, which enables types in one
language to interoperate with types in another language, including cross-language exception
handling. As well as ensuring that types are only used in appropriate ways, the runtime also
ensures that code doesn’t attempt to access memory that hasn’t been allocated to it.
The CLR provides built-in support for language interoperability. To ensure that you can
develop managed code that can be fully used by developers using any programming language, a
set of language features and rules for using them called the Common Language Specification
(CLS) has been defined. Components that follow these rules and expose only CLS features are
considered CLS-compliant.
.NET provides a single-rooted hierarchy of classes, containing over 7000 types. The
root of the namespace is called System; this contains basic types like Byte, Double, Boolean, and
String, as well as Object. All objects derive from System.Object. As well as objects, there are
value types. Value types can be allocated on the stack, which can provide useful flexibility. There
are also efficient means of converting value types to object types if and when necessary.
The set of classes is pretty comprehensive, providing collections, file, screen, and
network I/O, threading, and so on, as well as XML and database connectivity.
The class library is subdivided into a number of sets (or namespaces), each
providing distinct areas of functionality, with dependencies between the namespaces kept to a
minimum.
The multi-language capability of the .NET Framework and Visual Studio .NET
enables developers to use their existing programming skills to build all types of applications and
XML Web services. The .NET framework supports new versions of Microsoft’s old favorites Visual
Basic and C++ (as VB.NET and Managed C++), but there are also a number of new additions to
the family.
Visual Basic .NET has been updated to include many new and improved language
features that make it a powerful object-oriented programming language. These features include
inheritance, interfaces, and overloading, among others. Visual Basic also now supports structured
exception handling, custom attributes and also supports multi-threading.
Visual Basic .NET is also CLS compliant, which means that any CLS-compliant
language can use the classes, objects, and components you create in Visual Basic .NET.
Managed Extensions for C++ and attributed programming are just some of the
enhancements made to the C++ language. Managed Extensions simplify the task of migrating
existing C++ applications to the new .NET Framework.
C# is Microsoft’s new language. It’s a C-style language that is essentially “C++ for
Rapid Application Development”. Unlike other languages, its specification is just the grammar of
the language. It has no standard library of its own, and instead has been designed with the
intention of using the .NET libraries as its own.
ActiveState has created Visual Perl and Visual Python, which enable .NET-aware
applications to be built in either Perl or Python. Both products can be integrated into the Visual
Studio .NET environment. Visual Perl includes support for ActiveState’s Perl Dev Kit.
Operating System
C#.NET is also compliant with the CLS (Common Language Specification) and supports
structured exception handling. The CLS is a set of rules and constructs that are supported by the
CLR (Common Language Runtime). The CLR is the runtime environment provided by the .NET
Framework; it manages the execution of the code and also makes the development process
easier by providing services.
Constructors are used to initialize objects, whereas destructors are used to destroy them.
In other words, destructors are used to release the resources allocated to the object. In
C#.NET a destructor is compiled into an override of the Finalize method. The Finalize method
is used to complete the tasks that must be performed when an object is destroyed. It is
called automatically by the garbage collector when an object is destroyed. In addition, the
Finalize method is protected, so it can be called only from the class it belongs to or from derived classes.
Garbage Collection is another new feature in C#.NET. The .NET Framework monitors
allocated resources, such as objects and variables. In addition, the .NET Framework
automatically releases memory for reuse by destroying objects that are no longer in use.
In C#.NET, the garbage collector checks for the objects that are not currently in use by
applications. When the garbage collector comes across an object that is marked for garbage
collection, it releases the memory occupied by the object.
C#.NET also supports multithreading. An application that supports multithreading can handle
multiple tasks simultaneously. We can use multithreading to decrease the time taken by an
application to respond to user interaction.
There are different types of applications, such as Windows-based applications and Web-based applications.
The OLAP Services feature available in SQL Server version 7.0 is now
called SQL Server 2000 Analysis Services. The term OLAP Services has been replaced
with the term Analysis Services. Analysis Services also includes a new data mining
component. The Repository component available in SQL Server version 7.0 is now
called Microsoft SQL Server 2000 Meta Data Services. References to the
component now use the term Meta Data Services. The term repository is used only
in reference to the repository engine within Meta Data Services.
They are,
1. Design View
2. Datasheet View
Design View
Datasheet View
A query is a question asked of the data. Access gathers the data that
answers the question from one or more tables. The data that makes up the answer
is either a dynaset (which can be edited) or a snapshot (which cannot be edited). Each time we run
a query, we get the latest information in the dynaset. Access either displays the dynaset
or snapshot for us to view, or performs an action on it, such as deleting or updating.
Context Feature
Collaboration Feature
Content Feature
Context Feature:
It refers to the spatial dimension. Users’ check-in activities are distributed in a spatial
scope. Nearby places can contribute to the representation of users’ check-in records,
especially when the users visit a focused set of places. Density closeness of users’
check-in logs from the spatial perspective has received a lot of attention. It has shown
advantage over traditional spatial modeling, both in flexibility and accuracy. We
follow similar motivation and provide our own design to extract users’ preference
from the spatial aspects.
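One generic way to express density closeness is a kernel density estimate over a user's past check-in coordinates. The Python sketch below uses a Gaussian kernel on great-circle distance. It is an illustration of the general idea only and not the design proposed in this work; the bandwidth value, the coordinates and the function names are assumptions.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(a, b):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(h)))


def density_closeness(checkins, candidate, bandwidth_km=1.0):
    """Average Gaussian kernel value between the candidate and each past check-in."""
    if not checkins:
        return 0.0
    total = 0.0
    for point in checkins:
        distance = haversine_km(point, candidate)
        total += math.exp(-0.5 * (distance / bandwidth_km) ** 2)
    return total / len(checkins)


# Hypothetical check-in coordinates (latitude, longitude) of one user.
checkins = [(12.9716, 77.5946), (12.9720, 77.5950), (12.9352, 77.6245)]

near_score = density_closeness(checkins, (12.9718, 77.5948))
far_score = density_closeness(checkins, (13.2000, 77.7000))
print(near_score > far_score)
```

A place surrounded by many of the user's earlier check-ins receives a score close to 1, while a place far from all of them receives a score close to 0.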
Collaboration Feature:
performance are two-folds. First, the check-in matrix R is very sparse, i.e., it has
many zero items. Zero check-in record means that the user never visits the
corresponding place. This is probably because he/she is not a fan of that place or
his/her location scope is rather focused. Second, for some rated places, the frequency
information is not enough. In the counterpart representation of movie or product
recommendation, users not only reveal their favorite items with high ratings, but also
their least favorite items with low ratings. However, this does not hold in
the case of location prediction: the frequency of check-ins merely implies the confidence
level of users’ preference for the corresponding location, so it can barely
serve as an explicit rating given by users.
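The two issues above, sparsity of the check-in matrix R and frequencies acting as confidence rather than ratings, can be made concrete with a small Python sketch. The log format, the helper names and the linear confidence mapping 1 + alpha * frequency are assumptions made for illustration (the mapping is a common choice for implicit feedback, not something stated in this work).

```python
from collections import Counter


def build_checkin_matrix(logs, users, places):
    """Build R, where R[i][j] is how often user i checked in at place j."""
    counts = Counter(logs)  # logs is a list of (user, place) pairs
    return [[counts[(u, p)] for p in places] for u in users]


def sparsity(matrix):
    """Fraction of entries in the matrix that are zero."""
    cells = sum(len(row) for row in matrix)
    zeros = sum(1 for row in matrix for value in row if value == 0)
    return zeros / cells if cells else 0.0


def to_confidence(matrix, alpha=1.0):
    """Map check-in frequency to a confidence level instead of a rating."""
    return [[1.0 + alpha * value for value in row] for row in matrix]


# Hypothetical check-in logs.
users = ["u1", "u2"]
places = ["cafe", "gym", "park"]
logs = [("u1", "cafe"), ("u1", "cafe"), ("u1", "gym"), ("u2", "park")]

R = build_checkin_matrix(logs, users, places)
print(R)
print(sparsity(R))
print(to_confidence(R))
```

A zero entry therefore carries low confidence rather than a negative rating, which is why a zero cannot be read as dislike.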
Content Feature:
It refers to the place dimension. Places are not merely visited by users. These places
have inherent attributes, i.e., categories, text descriptions and other kinds of
annotations. We discuss how the transition between places reveals users’
interest/preference over time. Furthermore, the transition patterns benefit the closeness
extraction of places for better prediction. In this paper, we use the categories of the
location as its attribute description. We discuss a general way to obtain the content
description features. POIs, i.e., places in check-in records, are usually annotated with
categories or attributes. A user’s transition pattern from one place to another
shows the interest flow between these places. The closeness extracted from these
transition patterns can be used to predict the potential locations that the user will
visit next.
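Transition patterns between place categories can be sketched as first-order transition probabilities estimated from a user's ordered check-in categories. This Python example is a simplified illustration under that assumption; the category sequence and the function names are hypothetical, and the actual closeness extraction may differ.

```python
from collections import Counter, defaultdict


def transition_probabilities(category_sequence):
    """Estimate P(next category | current category) from consecutive check-ins."""
    counts = defaultdict(Counter)
    for current, following in zip(category_sequence, category_sequence[1:]):
        counts[current][following] += 1
    return {
        current: {nxt: n / sum(followers.values()) for nxt, n in followers.items()}
        for current, followers in counts.items()
    }


def most_likely_next(probabilities, current):
    """Return the most probable next category, or None if `current` was never left."""
    followers = probabilities.get(current)
    if not followers:
        return None
    return max(followers, key=followers.get)


# Hypothetical ordered check-in categories of one user.
sequence = ["home", "cafe", "office", "cafe", "office", "gym", "home", "cafe"]

probabilities = transition_probabilities(sequence)
print(probabilities["office"])
print(most_likely_next(probabilities, "cafe"))
```

Candidate places whose category matches a likely next category can then be ranked higher for the next visit.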
<html xmlns="http://www.w3.org/1999/xhtml">
<head runat="server">
<meta charset="utf-8">
<!-- HTML5 shim and Respond.js for IE8 support of HTML5 elements and media queries -->
<!-- WARNING: Respond.js doesn't work if you view the page via file:// -->
<!--[if lt IE 9]>
<script src="https://oss.maxcdn.com/html5shiv/3.7.2/html5shiv.min.js"></script>
<script src="https://oss.maxcdn.com/respond/1.4.2/respond.min.js"></script>
<![endif]-->
<div class="container">
<!-- Brand and toggle get grouped for better mobile display -->
<div class="navbar-header">
<span class="icon-bar"></span>
<span class="icon-bar"></span>
<span class="icon-bar"></span>
<a class="navbar-brand" href=""><span> GALLOP: </span>
<span style="font-size:25px">GlobAL feature fused LOcation Prediction for Different Check-in Scenarios</span></a>
<div class="menu">
<li role="presentation"><a
<li role="presentation"><a
<li role="presentation"><a
<div class="container">
<div class="row">
<div class="service">
<div class="">
<div class="">
<div class="last-div">
<div class="container">
<div class="row">
<div class="copyright">
<a target="_blank"
You can buy this theme without footer links online at:
<div class="container">
<div class="row">
<ul class="social-network">
<script src="js/jquery-2.1.1.min.js"></script>
<!-- Include all compiled plugins (below), or include individual files as needed -->
<script src="js/bootstrap.min.js"></script>
<script src="js/wow.min.js"></script>
<script src="js/jquery.easing.1.3.js"></script>
<script src="js/jquery.isotope.min.js"></script>
<script src="js/jquery.bxslider.min.js"></script>
<script src="js/functions.js"></script>
<script type="text/javascript">$('.portfolio').flipLightBox()</script>
Software system meets its requirements and user expectations and does not fail in
an unacceptable manner. There are various types of tests. Each test type addresses a
specific testing requirement.
Unit testing
Unit testing involves the design of test cases that validate that the internal
program logic is functioning properly and that program inputs produce valid
outputs. All decision branches and internal code flow should be validated. It is the
testing of individual software units of the application. It is done after the
completion of an individual unit and before integration. This is structural testing
that relies on knowledge of the unit's construction and is invasive. Unit tests
perform basic tests at component level and test a specific business process,
application, and/or system configuration. Unit tests ensure that each unique path of
a business process performs accurately to the documented specifications and contains
clearly defined inputs and expected results.
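As an illustration of validating every decision branch, the following sketch uses
Python's standard unittest module. The unit under test, is_valid_coordinate, is a
hypothetical helper invented for this example and is not taken from the project's
code; each test method exercises one branch of it.

```python
import unittest


def is_valid_coordinate(lat, lon):
    """Hypothetical unit under test: accept only in-range numeric coordinates."""
    if not isinstance(lat, (int, float)) or not isinstance(lon, (int, float)):
        return False  # branch 1: non-numeric input
    if lat < -90 or lat > 90:
        return False  # branch 2: latitude out of range
    if lon < -180 or lon > 180:
        return False  # branch 3: longitude out of range
    return True       # branch 4: valid input


class IsValidCoordinateTest(unittest.TestCase):
    def test_accepts_in_range_values(self):
        self.assertTrue(is_valid_coordinate(13.08, 80.27))

    def test_rejects_non_numeric_input(self):
        self.assertFalse(is_valid_coordinate("13.08", 80.27))

    def test_rejects_out_of_range_latitude(self):
        self.assertFalse(is_valid_coordinate(91.0, 0.0))

    def test_rejects_out_of_range_longitude(self):
        self.assertFalse(is_valid_coordinate(0.0, 181.0))


def run_tests():
    """Run the test case programmatically and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(IsValidCoordinateTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)


if __name__ == "__main__":
    run_tests()
```

Because the tests are written with knowledge of the function's internal branches,
this is structural (white box) testing in the sense described above.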
Integration testing
Functional test
System Test
System testing ensures that the entire integrated software system meets
requirements. It tests a configuration to ensure known and predictable results. An
example of system testing is the configuration oriented system integration test.
System testing is based on process descriptions and flows, emphasizing pre-driven
process links and integration points.
document. It is testing in which the software under test is treated as a black box:
you cannot “see” into it. The test provides inputs and responds to outputs without
considering how the software works.
Unit testing is usually conducted as part of a combined code and unit test
phase of the software lifecycle, although it is not uncommon for coding and unit
testing to be conducted as two distinct phases.
Test objectives
All field entries must work properly.
Pages must be activated from the identified link.
The entry screen, messages and responses must not be delayed.
Features to be tested
Verify that the entries are of the correct format
No duplicate entries should be allowed
All links should take the user to the correct page.
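The first two features can be checked in code along the following lines. This is a
sketch under assumptions: the report does not say which fields are validated, so the
example uses a hypothetical username field, a hypothetical format rule
(USERNAME_PATTERN) and a hypothetical in-memory EntryRegistry class.

```python
import re

# Assumed format rule: a letter followed by 2 to 19 letters, digits or underscores.
USERNAME_PATTERN = re.compile(r"^[A-Za-z][A-Za-z0-9_]{2,19}$")


class EntryRegistry:
    """Hypothetical store that enforces entry format and rejects duplicates."""

    def __init__(self):
        self._entries = set()

    def add(self, username):
        """Add `username`; raise ValueError on a bad format or a duplicate."""
        if not USERNAME_PATTERN.fullmatch(username):
            raise ValueError(f"invalid format: {username!r}")
        key = username.lower()  # treat names that differ only in case as duplicates
        if key in self._entries:
            raise ValueError(f"duplicate entry: {username!r}")
        self._entries.add(key)
        return True


if __name__ == "__main__":
    registry = EntryRegistry()
    registry.add("alice_01")
    for candidate in ("Alice_01", "1bad"):
        try:
            registry.add(candidate)
        except ValueError as error:
            print(error)
```

Link checking, the third feature, depends on the deployed pages and is not covered
by this sketch.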
Test Results: All the test cases mentioned above passed successfully. No defects
This paper presents a new feature fusion method for the location prediction problem.
We systematically analyze the check-in characteristics of different scenarios and
propose to model three categories of features and combine them in a global way.
Geographical, collaborative and categorical information are all utilized. We
propose new models that include more global features to improve the generality and
robustness of the prediction method. In addition, the approach is versatile and easy
to extend. It shows a clear advantage on different datasets and significantly
improves the prediction accuracy.
This research has several interesting future directions. For example, the feature
preprocessing stage could be improved, and a compact structure could be designed to
maintain the extracted features. It is also valuable to exploit the evolving factors
in location prediction. Additionally, the feature extraction methods proposed in
this work can be extended to enable incremental updating, so that a new
comprehensive location prediction and update setting can be utilized.