Saturday, February 24, 2018





Currently, the Internet provides an overwhelming number of data points, so there is a need to filter, prioritize, and deliver pertinent information in a way that lessens information overload, a problem that has become serious for many data managers.
Recommender systems can solve this problem by searching through large volumes of dynamically generated information to deliver customized content. The specific characteristics and different prediction techniques of recommendation systems serve as a guide for research and practice in the field.
The exponential growth in the amount of digital information created by the vast number of Internet users has resulted in information overload that delays personalized access to items of interest on the Internet. Existing information retrieval systems, such as Google, DevilFinder and AltaVista, have partially solved this problem, but they do not arrange and personalize content according to a user’s individual interests. This has provoked demand for recommender systems more than ever before. In essence, recommender systems are information filters that tackle the problem of information overload by extracting selective information from large amounts of dynamically generated data, based on a user’s preferences, interests, or observed behavior regarding items, events or texts. As such, a recommender system can predict whether or not a user would prefer an item, based on the user’s profile.
Recommender systems benefit service providers as well as users by reducing the transaction costs of finding and selecting items online, for example in a shopping environment. Recommendation systems have also been shown to improve decision processes, in both the quantity and the quality of searches. For instance, in e-commerce a recommender system can improve revenues simply because it is a more effective means of selling products. In other fields, such as libraries, recommender systems help users by permitting them to move beyond catalog searches. In this light, the need for precise recommendation techniques within a system that can provide pertinent and reliable recommendations for users is of utmost importance.
Various approaches for structuring recommender systems have been developed, making use of collaborative filtering, content-based filtering, or hybrid filtering. The collaborative filtering technique is the most well known and the most commonly implemented. Collaborative filtering identifies items that other users with similar taste have chosen and uses their opinions to recommend items to the searching user. Collaborative recommender systems have been implemented in different application areas. As an example, GroupLens, from the Social Computing Research group at the University of Minnesota, is a news-based system that uses collaborative methods to help a researcher find articles in a massive news database.
Another example is Amazon, which makes use of a topic diversification algorithm to improve its recommendations. The system uses a collaborative filtering method, generating a matrix of similar items offline through item-to-item relationships in the matrix. The system then recommends other similar products according to a user’s purchase profile.
[Figure: an example user-item rating matrix used by collaborative filtering]
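Since the original diagram is not reproduced here, below is a minimal sketch of what such a rating matrix and the offline item-to-item similarity computation might look like in Python; the toy ratings and the use of cosine similarity are illustrative assumptions, not Amazon’s actual data or implementation.

import numpy as np

# Hypothetical toy rating matrix: rows = users, columns = items,
# 0 means "not rated".
ratings = np.array([
    [5, 3, 0, 1],
    [4, 0, 0, 1],
    [1, 1, 0, 5],
    [0, 1, 5, 4],
], dtype=float)

def cosine_similarity(a, b):
    # Cosine of the angle between two rating vectors (0 if either is empty).
    norm = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / norm) if norm else 0.0

# Offline step: build the item-to-item similarity matrix by comparing
# each pair of item columns.
n_items = ratings.shape[1]
item_sim = np.zeros((n_items, n_items))
for i in range(n_items):
    for j in range(n_items):
        item_sim[i, j] = cosine_similarity(ratings[:, i], ratings[:, j])

print(np.round(item_sim, 2))

At recommendation time, the items most similar to those a user has already bought or rated highly are then ranked and suggested.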
“Collaborative filtering (CF), in the newer sense, is a method of making automatic predictions (filtering) about the interests of a user by collecting preferences or taste information from many users (collaborating). The underlying assumption of the collaborative filtering approach is that if a person A has the same opinion as a person B on an issue, A is more likely to have B's opinion on a different issue than that of a randomly chosen person.”
“Memory-based CF can be achieved in two ways through user-based and item-based techniques. User based collaborative filtering technique calculates similarity between users by comparing their ratings on the same item, and it then computes the predicted rating for an item by the active user as a weighted average of the ratings of the item by users similar to the active user where weights are the similarities of these users with the target item.”
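As a concrete illustration of the user-based technique quoted above, the following sketch predicts an active user’s rating for one item as a similarity-weighted average of other users’ ratings; the toy ratings and the choice of Pearson correlation as the similarity measure are assumptions for illustration.

import numpy as np

def pearson_sim(u, v):
    # Pearson correlation between two users over their co-rated items.
    mask = (u > 0) & (v > 0)          # items rated by both users
    if mask.sum() < 2:
        return 0.0
    cu, cv = u[mask] - u[mask].mean(), v[mask] - v[mask].mean()
    denom = np.linalg.norm(cu) * np.linalg.norm(cv)
    return float(cu @ cv / denom) if denom else 0.0

def predict(ratings, active, item):
    # Weighted-average prediction of ratings[active, item] (0 = unrated).
    mean_a = ratings[active][ratings[active] > 0].mean()
    num = den = 0.0
    for other in range(ratings.shape[0]):
        if other == active or ratings[other, item] == 0:
            continue
        s = pearson_sim(ratings[active], ratings[other])
        mean_o = ratings[other][ratings[other] > 0].mean()
        num += s * (ratings[other, item] - mean_o)
        den += abs(s)
    return mean_a + num / den if den else mean_a

ratings = np.array([[5, 3, 0, 1],
                    [4, 0, 4, 1],
                    [1, 1, 5, 5],
                    [5, 4, 0, 2]], dtype=float)
print(predict(ratings, active=0, item=2))  # predict user 0's rating of item 2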
Models are developed using different data mining and machine learning algorithms in order to predict users' ratings of unrated items. Among these models, Bayesian analysis, clustering models, latent semantic models such as singular value decomposition (SVD), probabilistic latent semantic analysis, multiple multiplicative factor models, latent Dirichlet allocation, and Markov decision process based models are the most commonly used.
Dimensionality reduction methods are used as a complementary procedure to improve the robustness and accuracy of the memory-based approach. Methods like singular value decomposition (SVD) and principal component analysis (PCA), both known as latent factor models, compress the user-item matrix into a low-dimensional representation of the latent factors in the data.
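For instance, a truncated SVD can compress a small user-item matrix into a handful of latent factors; the rank k = 2 and the toy data below are arbitrary choices for illustration.

import numpy as np

ratings = np.array([[5, 3, 0, 1],
                    [4, 0, 4, 1],
                    [1, 1, 5, 5],
                    [5, 4, 0, 2]], dtype=float)

# Full SVD, then keep only the k strongest latent factors.
U, s, Vt = np.linalg.svd(ratings, full_matrices=False)
k = 2
approx = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]   # rank-k reconstruction

print(np.round(approx, 2))  # low-dimensional approximation of the ratings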
Memory-based
This technique uses user rating data to compute the similarity between users or items.
Model-based
In this method, models are developed using different algorithms for data mining and machine learning to predict ratings of unrated items.
Hybrid
A number of applications combine the memory-based and the model-based CF algorithms to overcome the limitations of both approaches and to improve prediction performance. However, hybrid approaches have increased complexity and are expensive to implement. They are mainly used commercially, for example in the Google News recommender system.
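One simple way to combine the two approaches, shown below purely as an illustration (real systems such as Google News use far more elaborate schemes), is a weighted blend of a memory-based prediction and a model-based prediction, with a hypothetical mixing weight alpha.

def hybrid_predict(mem_pred, model_pred, alpha=0.5):
    # Blend a memory-based and a model-based prediction.
    # alpha is a hypothetical mixing weight, tuned on held-out data.
    return alpha * mem_pred + (1 - alpha) * model_pred

print(hybrid_predict(mem_pred=3.8, model_pred=4.2, alpha=0.6))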
Current Algorithms in Use:
· Memory-based algorithms
· Model-based algorithms
· Item-based collaborative filtering
· Personality Diagnosis
· Singular Value Decomposition (SVD)
· Principal Component Analysis (PCA)
· Association Rules
Commonly there are two similarity measures: the Pearson correlation coefficient and vector (cosine) similarity, both shown in the sketches above. These two measures were identified by J. Breese, D. Heckerman, and C. Kadie in their work “Empirical analysis of predictive algorithms for collaborative filtering.”
The quality of the predictions is good, but because memory-based CF uses the entire database every time it makes a prediction, it depends on memory availability and can be very slow.
An artificial neural network (ANN) is an architecture of many connected neurons, or nodes, arranged in layers in a systematic way. The connections between neurons carry weights that depend on the amount of influence one neuron or node has on another. Neural networks offer advantages in certain special problem situations.
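To make the idea of weighted connections concrete, here is a minimal sketch of a single forward pass through a tiny two-layer network; the random weights are placeholders, not trained values.

import numpy as np

rng = np.random.default_rng(0)

# A tiny network: 3 inputs -> 4 hidden neurons -> 1 output neuron.
W1 = rng.normal(size=(3, 4))   # connection weights, input -> hidden
W2 = rng.normal(size=(4, 1))   # connection weights, hidden -> output

def forward(x):
    # One forward pass; each layer is a weighted sum plus a nonlinearity.
    hidden = np.tanh(x @ W1)   # the weights scale each neuron's influence
    return np.tanh(hidden @ W2)

print(forward(np.array([0.5, -1.0, 2.0])))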
Neuroph Studio is a free Java neural network development environment built on top of the NetBeans Platform and based on the Neuroph framework. It is licensed under the Common Development and Distribution License (CDDL). http://neuroph.sourceforge.net/d...
