Author: Atanassov A.

The importance of big data and big data mining has grown significantly in recent years. Various e-sources, such as social networks, e-commerce sites, e-mails and sensors, generate large amounts of structured and unstructured numerical and text data. This data provides valuable information about customers' preferences or ratings of products or commodities. This information is essential for making predictions based on sentiment analysis of the data. Sentiment analysis of large amounts of text data requires specific big data and machine learning /ML/ libraries. This paper proposes the implementation of a system for big data sentiment analysis using ML algorithms. It is based on the Naïve Bayes and Support Vector Machine /SVM/ classification algorithms for text analysis. The system is implemented in Java and uses the Apache Spark ML libraries, which are flexible, fast and scalable. The system is tested with the well-known Amazon dataset, and its performance is measured in terms of accuracy. The obtained results confirm the effectiveness of the big data sentiment analysis algorithms. The system can be applied to the recommendation of products and services or to the prediction of customers' needs.
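To make the classification step concrete, here is a minimal sketch of a multinomial Naïve Bayes sentiment classifier with add-one smoothing. It is not the paper's system, which is written in Java on top of Apache Spark ML. It is plain Python, and the tokenizer, the function names and the four toy reviews are assumptions made up for illustration.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase a review and split it on whitespace (a deliberately crude tokenizer)."""
    return text.lower().split()


def train_naive_bayes(labeled_reviews):
    """Collect per-class document counts and word counts from (text, label) pairs."""
    class_docs = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, label in labeled_reviews:
        class_docs[label] += 1
        for word in tokenize(text):
            word_counts[label][word] += 1
            vocabulary.add(word)
    return class_docs, word_counts, vocabulary


def predict(model, text):
    """Return the label with the highest log-posterior, using add-one smoothing."""
    class_docs, word_counts, vocabulary = model
    total_docs = sum(class_docs.values())
    best_label, best_score = None, -math.inf
    for label in class_docs:
        score = math.log(class_docs[label] / total_docs)  # log prior
        class_total = sum(word_counts[label].values())
        for word in tokenize(text):
            if word not in vocabulary:
                continue  # words never seen in training carry no evidence
            smoothed = (word_counts[label][word] + 1) / (class_total + len(vocabulary))
            score += math.log(smoothed)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def accuracy(model, labeled_reviews):
    """Fraction of reviews whose predicted label matches the given label."""
    correct = sum(predict(model, text) == label for text, label in labeled_reviews)
    return correct / len(labeled_reviews)


# Hypothetical toy data; the paper uses an Amazon review dataset instead.
TRAIN = [
    ("great product works well", "positive"),
    ("excellent quality love it", "positive"),
    ("terrible product broke quickly", "negative"),
    ("poor quality waste of money", "negative"),
]
model = train_naive_bayes(TRAIN)
```

Calling `predict(model, "love this great quality")` classifies a new review, and `accuracy(model, held_out_reviews)` gives the kind of accuracy figure the abstract refers to, provided the reviews passed in were not used for training.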

Big data is large-volume, heterogeneous, distributed data. In big data applications, where data collection has grown continuously, it is expensive to manage, capture, extract and process data using existing software tools. As the size of the data in a data warehouse increases, data analysis becomes expensive to perform. In recent years, a number of computation- and data-intensive scientific data analyses have been established. To perform large-scale data mining analyses that meet the scalability and performance requirements of big data, several efficient parallel and concurrent algorithms have been applied. For data processing, big data processing frameworks rely on cluster computers and on the parallel execution framework provided by MapReduce. MapReduce is a parallel programming model and an associated implementation for processing and generating large data sets. In this paper, we work with MapReduce: we use a MapReduce solution for handling large data efficiently and discuss its advantages, its disadvantages, and how it can be integrated with other technologies.
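The MapReduce programming model mentioned above can be illustrated with the classic word-count task. The sketch below simulates the map, shuffle and reduce phases in a single Python process. It is an assumption-laden illustration, not a distributed implementation and not the solution from the paper, and all function names are hypothetical.

```python
from collections import defaultdict


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(mapped_pairs):
    """Shuffle step: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in mapped_pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: combine all values for one key into a single result."""
    return key, sum(values)


def map_reduce(documents):
    """Run the three phases in sequence; a real framework runs them across a cluster."""
    mapped = [pair for doc in documents for pair in map_phase(doc)]
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())
```

Because each call to `map_phase` depends only on its own record and each call to `reduce_phase` depends only on its own key, a framework can distribute both phases across cluster nodes.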

This paper defines constrained functional similarity between 2-D trajectories via minimizing the H1 semi-norm of the difference between the trajectories. An exact general solution is obtained for the case wherein the components of the trajectories are mesh functions defined on a uniform mesh and the imposed constraints are linear. Various examples are presented, one of which features application to mechanics and two-point boundary value problems. A MATLAB code is given for the solution of one of the examples. The code could easily be adjusted to other cases.
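As a rough illustration of the kind of problem described, the following sketch writes out a generic discrete formulation for a single trajectory component. The notation is an assumption and is not taken from the paper: x is the sought mesh function, y the reference mesh function, h the mesh step, D the forward-difference matrix, and A x = b the linear constraints. The paper's own exact solution may be stated differently.

```latex
% Assumed notation (not from the paper): x, y in R^{n+1} are mesh functions on a
% uniform mesh with step h; D is the n-by-(n+1) forward-difference matrix;
% A x = b collects the linear constraints.
\begin{aligned}
|x - y|_{H^1}^2 &\approx \frac{1}{h}\sum_{i=1}^{n}\bigl[(x_i - y_i) - (x_{i-1} - y_{i-1})\bigr]^2
  = (x - y)^{\mathsf T} K (x - y), \qquad K = \tfrac{1}{h} D^{\mathsf T} D, \\
\mathcal{L}(x, \lambda) &= (x - y)^{\mathsf T} K (x - y) + 2\lambda^{\mathsf T}(A x - b), \\
\begin{pmatrix} K & A^{\mathsf T} \\ A & 0 \end{pmatrix}
\begin{pmatrix} x \\ \lambda \end{pmatrix}
&= \begin{pmatrix} K y \\ b \end{pmatrix}.
\end{aligned}
```

The last line follows from setting the gradient of the Lagrangian with respect to x to zero and appending the constraints. Since K annihilates constant vectors, the system has a unique solution only if A has full row rank and the constraints also pin down the constant shift. In the simplest case, where the constraints prescribe the two endpoint values, the minimizer is y plus a linear function of the mesh index, which is the discrete analogue of a two-point boundary value problem.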

The paper presents a comparison of two Case-Based Reasoning (CBR) oriented software frameworks, myCBR3 Workbench and CBR-Works ver. 4.3.0, for the development of predictive diagnosis and maintenance systems. These frameworks were selected after detailed preliminary comparisons of previous versions of myCBR presented in, as well as investigations of the capabilities of four other popular CBR software systems. The evaluation of myCBR and CBR-Works covers their capacity to support the 4R CBR cycle, the clustering of cases, the variety of similarity functions available, etc. Specific capabilities, such as GUI provision, database support and the knowledge required to work with the systems, were also considered.
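The role that similarity functions play in the retrieval step of CBR can be sketched as follows. This is a generic weighted nearest-neighbour retrieval in plain Python. It does not use the myCBR or CBR-Works APIs, and the attribute names, weights, value range and case base are hypothetical values chosen for illustration.

```python
def numeric_similarity(a, b, value_range):
    """Local similarity for numbers: 1 when equal, falling linearly to 0 across the range."""
    return max(0.0, 1.0 - abs(a - b) / value_range)


def symbolic_similarity(a, b):
    """Local similarity for symbols: exact match or nothing."""
    return 1.0 if a == b else 0.0


def global_similarity(query, case, weights, ranges):
    """Weighted average of the local similarities over the weighted attributes."""
    total = 0.0
    for attribute, weight in weights.items():
        if attribute in ranges:
            local = numeric_similarity(query[attribute], case[attribute], ranges[attribute])
        else:
            local = symbolic_similarity(query[attribute], case[attribute])
        total += weight * local
    return total / sum(weights.values())


def retrieve(query, case_base, weights, ranges, k=1):
    """Return the k stored cases most similar to the query, best first."""
    ranked = sorted(
        case_base,
        key=lambda case: global_similarity(query, case, weights, ranges),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical diagnostic case base.
CASE_BASE = [
    {"temperature": 90, "vibration": "high", "diagnosis": "bearing wear"},
    {"temperature": 60, "vibration": "low", "diagnosis": "normal"},
    {"temperature": 85, "vibration": "low", "diagnosis": "overheating"},
]
WEIGHTS = {"temperature": 2.0, "vibration": 1.0}
RANGES = {"temperature": 50.0}  # numeric attributes and their assumed value ranges
```

A query such as `retrieve({"temperature": 88, "vibration": "high"}, CASE_BASE, WEIGHTS, RANGES)` returns the stored case whose diagnosis would be proposed for reuse. Frameworks differ mainly in which local similarity functions they offer and how they are configured.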

Every day, millions of letters are redirected from the old to the new addresses of their recipients. This happens because people change their homes or jobs and companies change their locations. To avoid sending letters first to the old and then to the new address, which consumes a lot of resources, a new generation of Postal Automated Redirection System /PARS/ is used by the USPS. PARS consists of more than 50 processing and distribution centers /P&DC/ equipped with special optical character recognition software that automatically redirects letters to the new addresses. This paper presents the design and development of a web-based Performance Diagnostic System /PDS/ intended for operational control, monitoring and diagnostics of PARS hardware and software components, as well as for diagnostics and analysis of sorted letters. The PDS can be used for online remote control of PARS and of the quality and performance of the sorting process.