Petanux - Intelligence Meets Innovation

Publications

Exploring the Frontier of AI: Petanux’s Contributions and Innovations

The Research and Innovation Department at our AI company is the cornerstone of our technological advancements and groundbreaking discoveries. Dedicated to pushing the boundaries of artificial intelligence, this department spearheads numerous funded projects that explore cutting-edge AI applications and methodologies. Our team of experts consistently publishes scientific papers in prestigious journals, contributing valuable knowledge to the global AI community. By translating these scientific achievements into practical, industrial solutions, we ensure that our innovations are not only theoretical but also highly applicable in real-world scenarios. This department fosters a collaborative environment, partnering with leading academic institutions and industry leaders to stay at the forefront of AI research. 


Through rigorous experimentation and continuous learning, the Research and Innovation Department drives our mission to transform industries and improve lives with state-of-the-art AI technologies. Petanux has a highly skilled team of AI experts with deep knowledge and experience across a variety of AI fields, including machine learning, computer vision, and natural language processing. Our team is dedicated to exploring the full potential of AI and how it can be applied to real-world challenges and industries. We continuously strive to advance AI and explore new frontiers in this exciting and rapidly evolving field. Petanux’s contributions to the AI and big data industry have been recognized through the numerous high-impact publications our team has produced.

List of Publications

Background and Objectives: Today, the growing number of Internet-connected smart devices requires powerful processing servers such as cloud and fog, and makes fulfilling requests and services more demanding than ever before. The geographical distance of IoT devices from fog and cloud servers has turned issues such as delay and energy consumption into major challenges, and fog computing has emerged as a promising technology in this field.
Methods: In this paper, service and energy management approaches are surveyed in general. We then explain our motivation, the systematic literature review (SLR) procedure, and how the related works were selected.
Results: This paper introduces four domains of service and energy management: Architecture, Resource Management, Scheduling Management, and Service Management. Scheduling management appears in 38% of the papers, making it the most studied domain, followed by resource management, which accounts for about 26% of the papers.
Conclusion: About 81% of the fog computing papers simulated their approaches, while the others implemented their schemes on a testbed in a real environment. Furthermore, 30% of the papers presented an architecture or framework along with their research. In this systematic literature review, papers were extracted from five established databases, including IEEE Xplore, Wiley, Science Direct (Elsevier), Springer Link, and Taylor & Francis, covering 2013 to 2022. We obtained 1596 papers related to the subject and filtered them down to 47 distinct studies. We analyze and discuss these studies, review the service-quality parameters reported in the papers, and finally present the benefits, drawbacks, and innovations of each study.

S. M. Hashemi, A. Sahafi, A. M. Rahmani, M. Bohlouli
2023 | Journal of Electrical and Computer Engineering Innovations (JECEI)

Predictive modeling of various systems around the world is extremely important from the physics and engineering perspectives. Recognizing different systems and being able to predict their future behavior can lead to numerous significant applications. For the most part, physics is frequently used to model different systems. Physical modeling can also help resolve complexity and achieve superior performance within the emerging field of artificial intelligence and the challenges associated with it. Physical modeling provides data and knowledge that offer meaningful and complementary understanding of the system, so by using enriched data and training phases, the overall integrated model achieves enhanced accuracy. The effectiveness of hybrid physics-guided or machine-learning-guided models has been validated by experimental results of diverse use cases. Increased accuracy, interpretability, and transparency are the results of such hybrid models. In this paper, we provide a detailed overview of how machine learning and physics can be integrated into an interactive approach. We propose a classification of possible interactions between physical modeling and machine learning techniques, comprising three types of approaches: (1) physics-guided machine learning, (2) machine learning-guided physics, and (3) mutually guided physics and ML. We study the models and specifications of each of these three approaches in depth in this survey.

Azra Seyyedi, Mahdi Bohlouli, SeyedEhsan Nedaaee Oskoee
2023 | ACM Computing Surveys

Recommender systems have evolved beyond basic user-item filtering methods in research. However, these filtering methods are still commonly used in real-world scenarios, mainly because they are easier to debug and reconfigure. Indeed, existing frameworks do not adequately support algorithmic tuning; moreover, they are primarily focused on reproducing state-of-the-art accuracy rather than on ease of algorithm development and maintenance. Rapid, iterative experimentation and debugging are therefore considerably hindered. In this work, we propose an AutoML-based framework with a modular deep session-based recommender code-base and an integrated automated HyperParameter Tuning (HPT4Rec) component. The proposed framework automates the search for the best session-based model for a given dataset, so it can help keep the model consistently up to date as the type and volume of data change, as is common in real-world scenarios. We demonstrate that HPT4Rec provides extensible data structures, training-service compatibility, and GPU-accelerated execution while maintaining training efficiency and recommendation accuracy. We conducted our experiments on the benchmark RecSys 2015 dataset and achieved performance on par with state-of-the-art results. Our results show the importance of continuous, iterative parameter tuning, particularly in real-world scenarios.
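The core loop of such automated hyperparameter tuning can be sketched as a search over a configuration space. The search space, the `train_eval` callback, and the random-search strategy below are illustrative assumptions, not HPT4Rec's actual implementation:

```python
import random

def random_hpt(train_eval, space, trials=20, seed=0):
    """Sample configurations from `space` and keep the best-scoring one.

    Minimal stand-in for an automated HPT component: `train_eval` is
    assumed to train a model with the given config and return a
    validation score (higher is better).
    """
    rng = random.Random(seed)
    best_cfg, best_score = None, float("-inf")
    for _ in range(trials):
        cfg = {name: rng.choice(values) for name, values in space.items()}
        score = train_eval(cfg)
        if score > best_score:
            best_cfg, best_score = cfg, score
    return best_cfg, best_score
```

In a session-based recommender, `train_eval` would wrap a full train-and-validate cycle, so re-running the same loop keeps the model up to date as the data distribution shifts.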

Amir Reza Mohammadi, Amir Hossein Karimi, Mahdi Bohlouli
2023 | 34th Workshop on Foundations of Databases (Grundlagen von Datenbanken)

With the advent of the Internet of Things (IoT) and GPS-enabled smartphones, a large volume of historical travel data has been collected. Access to this type of big data has made Point of Interest (POI) recommendation feasible. Given the variety of available locations and the sparsity of users’ check-ins, recommending locations is known to be a complex task. A significant volume of research has been devoted to POI recommendation systems using various features such as geographical distribution, social structure, and temporal behavioral patterns. In this paper, a novel Geographical-Temporal and Social Multi-Center POI recommendation (GeoTS) is introduced by adding social relations to the decision-making criteria. We hypothesize that social relationships have an impact on whether people decide to visit a place. Extensive experiments on a real-world dataset (Gowalla) demonstrate the effectiveness of GeoTS, with improvements of 7.03% and 7.76% in Precision and nDCG, respectively.
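A minimal sketch of the idea of blending geographical and social signals into one ranking score. The distance-decay form, the weights, and the field names are our own illustrative assumptions, not the actual GeoTS formulation:

```python
import math

def geo_score(user_centers, poi, alpha=1.0):
    # Distance decay: the closer a POI is to one of the user's activity
    # centers, the higher its geographic score (hypothetical form).
    d = min(math.dist(c, poi["loc"]) for c in user_centers)
    return 1.0 / (1.0 + alpha * d)

def social_score(user_friends, poi_visitors):
    # Fraction of the user's friends who have checked in at this POI.
    if not user_friends:
        return 0.0
    return len(user_friends & poi_visitors) / len(user_friends)

def rank_pois(user, pois, w_geo=0.6, w_soc=0.4):
    # Weighted blend of the two signals; weights are assumptions.
    scored = [(w_geo * geo_score(user["centers"], p) +
               w_soc * social_score(user["friends"], p["visitors"]), p["id"])
              for p in pois]
    return [pid for _, pid in sorted(scored, reverse=True)]
```

A nearby POI visited by a friend should outrank a distant POI no friend has visited, which is exactly the hypothesis the paper tests.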

Simin Bakhshmand, Bahram Sadeghi Bigham, Mahdi Bohlouli
2022 | International Conference on Intelligent Systems Design and Applications

Purpose: Many people die as a result of medical errors, and offering clinical learning can lead to better medical care. Clinics have conventionally relied on direct, in-person teaching of personnel, but they are now starting to adopt electronic learning (e-learning) mechanisms to facilitate training at work or other suitable places. The objective of this study is to identify and prioritize medical learning systems in developing countries; therefore, this paper describes a line of research for developing such systems.
Design/methodology/approach: Nowadays, organizations face rapidly changing markets, competitive strategies, technological innovations, and broad accessibility of medical information. At the same time, the developing world faces a series of health crises that threaten millions of lives, and a lack of infrastructure and of trained, experienced staff are essential barriers to scaling up treatment. Promoting medical learning systems in developing countries can help meet these challenges. This study identifies, from the literature, multiple factors that influence the success of e-learning systems. The authors present a systematic literature review (SLR) up to 2019 on medical learning systems in developing countries; 109 articles were identified, and 17 of them were finally selected through the article-selection procedure.
Findings: The paper shows that e-learning systems offer significant advantages for the medical sector of developing countries. The authors found that executive, administrative, and technological parameters have substantial effects on implementing e-learning in the medical field. Learning management systems offer a virtual medium for richer and quicker interaction between learners and teachers and for fast, efficient instruction, using computer and Internet technologies in learning procedures and providing several teaching-learning tools.
Research limitations/implications: The authors limited the search to Scopus, Google Scholar, Emerald, Science Direct, IEEE, PLoS, BMC and ABI/Inform; these sources likely give a good picture of the related articles, but other academic journals may contain relevant work as well. This study only reviewed articles retrieved with keywords such as “medical learning systems,” “medical learning environment” and “developing countries,” so relevant studies published without those specific keywords may have been missed. There is also a need for further research using other methodologies. Lastly, non-English publications were excluded, so potentially related papers published in other languages are not covered.
Practical implications: This paper helps physicians and scholars better understand clinical learning systems in developing countries, and its outcomes can help hospital managers speed up the implementation of e-learning mechanisms. This research may also contribute to the body of knowledge and experience, weakening the image of the developing world as a begging bowl constantly requesting help. The authors hope that their recommendations help clinical educators, particularly in developing countries, adopt the trends of clinical education in a changing world.
Originality/value: This paper is one of the pioneering works to systematically review the adoption of medical learning, specifically in developing countries.

Mahdi Bohlouli, Omed Hassan Ahmed, Ali Ehsani, Marwan Yassin Ghafour, Hawkar Kamaran Hama, Mehdi Hosseinzadeh, Aram Mahmood Ahmed
2022 | Kybernetes

Although fog computing is a new research topic, there is still a lack of robust, integrated solutions for service activation management and for distributing IoT services over the available fog computing resources. This paper presents a multi-objective gray wolf optimization (GWO) solution for more efficient fog computing service scheduling and activation management. The solution uses a multi-objective function in the resource allocation process, developing and improving the proposed algorithm to check the status of resources and manage tasks. The main purpose of this study is to create a trade-off between energy consumption and task execution time. The research presents a two-stage multi-objective approach to the scheduling and task-offloading problem: first, the GWO algorithm solves the scheduling problem, and then container migration solves the task-offloading problem and ensures appropriate resource allocation. Container migration allows idle physical servers to be turned off, reducing power consumption, improving load imbalance, reducing latency, and improving efficiency. The proposed method was evaluated in three scenarios with 700, 1000, and 5000 nodes, implemented and simulated in iFogSim, and compared with five classical algorithms. The analytical results indicate better performance of the proposed strategy: the average execution time for host selection is reduced by 15%, 20%, 25%, 20%, and 21% compared to Particle Swarm Optimization (PSO), Ant Colony Optimization (ACO), the Grasshopper Optimization Algorithm (GOA), the Genetic Algorithm (GA), and the Cuckoo Optimization Algorithm (COA), respectively; the average time required for container reallocation is reduced by 12%; and the service-level agreement (SLA) violation rate stays in the range of 9-10% compared to all other solutions.
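The energy-vs-time trade-off at the heart of such scheduling can be captured by a weighted-sum objective over makespan and total energy. The fitness below is a generic illustration (equal weights are an assumption), and the seeded random search merely stands in for the gray-wolf position updates of the actual GWO:

```python
import random

def fitness(assign, exec_time, energy, w_time=0.5, w_energy=0.5):
    # assign[i] = node chosen for task i. Makespan = heaviest node load;
    # the weighted sum trades it off against total energy (our assumption).
    makespan = max(sum(exec_time[t][n] for t, n in enumerate(assign) if n == node)
                   for node in set(assign))
    total_energy = sum(energy[t][assign[t]] for t in range(len(assign)))
    return w_time * makespan + w_energy * total_energy

def random_search(exec_time, energy, n_nodes, iters=200, seed=0):
    # Stand-in for the GWO search loop: sample assignments, keep the best.
    rng = random.Random(seed)
    best, best_f = None, float("inf")
    for _ in range(iters):
        cand = [rng.randrange(n_nodes) for _ in exec_time]
        f = fitness(cand, exec_time, energy)
        if f < best_f:
            best, best_f = cand, f
    return best, best_f
```

With two tasks that each run fast on a different node, the search settles on the assignment that balances load, which is the behavior a real GWO scheduler would converge to faster.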

Sayed Mohsen Hashemi, Amir Sahafi, Amir Masoud Rahmani, Mahdi Bohlouli
2022 | IEEE Access

The Digital Transformation in Industry is one of the top technological drivers of change for the future of employment in the 4th Industrial Revolution. To help face this challenge, the Erasmus+ SkoPs project aims at an inclusive empowerment of the European workforce by designing innovative methods and digital tools for blended teaching, training, learning, and assessment, including open-access IIoT online courses and webinars. To achieve this goal, the SkoPs team involves companies, local development agencies, and HEIs from three different European countries. In this article, the foundations, structure, pedagogical framework, and main intellectual outputs of the project are outlined. It also discusses equity issues that may occur if they are not specifically considered during training design, together with a set of guidelines, templates, and actions that have been put in place as part of the project's quality management and monitoring plan to prevent such inequalities. Finally, to further increase equity in education, the article discusses the need for such projects to be accompanied by a comprehensive communication plan that helps not only disseminate and increase the impact of the training modules across European countries, but also reach traditionally disadvantaged communities such as rural areas or women. The SkoPs communication plan includes examples of actions designed particularly for this purpose.

A Behravan, N Bogonikolos, M Bohlouli, C Cachero, P Kaklatzis, B Kiamanesh, S Luján-Mora, S Meliá, M Mirhaj, R Obermaisser
2022 | EDULEARN22 Proceedings

The Internet of Things (IoT) is a new phenomenon that offers novel business opportunities. IoT makes the world programmable and can provide several benefits for organizations. According to IoT surveys, cyber-security issues are among the most extensive and complicated challenges faced by IoT devices. Threat detection is a preventive measure against malware, ransomware, and attacks, which become more serious each year because of the dramatic rise in malware attacks. This article investigates threat detection techniques published from 2017 to August 2021, falling into three categories: malware detection, attack detection, and ransomware detection. We examine the solutions, techniques, features, classifiers, and tools proposed by IoT researchers, and pose several questions whose answers may help researchers devise more efficient solutions in future work. Furthermore, the achievements and disadvantages of each study are discussed. Finally, based on the reviewed studies, open challenges and practical directions for future work on threat detection techniques in the IoT are suggested.

Nasim Soltani, Amir Masoud Rahmani, Mahdi Bohlouli, Mehdi Hosseinzadeh
2022 | Concurrency and Computation: Practice and Experience

Internet of Things (IoT) network lifetime and energy issues are among the most important challenges in today’s smart world. Clustering is an effective solution: all nodes are arranged into virtual clusters, with one node serving as the cluster head. The right selection of the cluster head reduces energy consumption dramatically. This is especially crucial for the IoT, which is widely deployed in environments such as forests or the smart agriculture sector. In this paper, an Energy Efficient Minimum Spanning Tree algorithm (EEMST) is presented to select the optimal cluster head and route data, based on graph theory, for a multihop Internet of Things. The algorithm computes a Euclidean-distance-based minimum spanning tree over a weighted graph; this weighted minimum spanning tree is used to choose the optimal cluster head and, accordingly, determine the shortest path for data transmission between member nodes and the cluster head. The proposed EEMST algorithm supports both intracluster multihop routing and intercluster single-hop routing. The simulated experimental results confirm a significant improvement in IoT system lifetime with the proposed algorithm compared to the baselines.
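The graph-theoretic core, a minimum spanning tree over Euclidean node distances plus a head-selection rule, can be sketched as follows. Prim's algorithm and the minimum-total-distance head criterion are textbook stand-ins; the paper's exact edge weighting and head rule may differ:

```python
import math

def mst_prim(points):
    # Prim's algorithm on the complete Euclidean graph over the nodes;
    # returns the MST edges and the total edge weight.
    n = len(points)
    in_tree = {0}
    edges, total = [], 0.0
    while len(in_tree) < n:
        u, v, w = min(((i, j, math.dist(points[i], points[j]))
                       for i in in_tree for j in range(n) if j not in in_tree),
                      key=lambda e: e[2])
        in_tree.add(v)
        edges.append((u, v))
        total += w
    return edges, total

def cluster_head(points):
    # One plausible head choice (an assumption): the node with the
    # smallest summed distance to all other nodes in the cluster.
    return min(range(len(points)),
               key=lambda i: sum(math.dist(points[i], points[j])
                                 for j in range(len(points))))
```

Member nodes would then forward data to the head along their MST path (intracluster multihop), while heads talk to the sink directly (intercluster single-hop).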

Vida Doryanizadeh, Amin Keshavarzi, Tajedin Derikvand, Mahdi Bohlouli
2021 | Applied Artificial Intelligence

To become a standard part of our daily lives, autonomous vehicles must ensure human safety, and this safety comes from knowing what will happen in the future. The most common approach in state-of-the-art methods for sensorimotor driving is behavior cloning, but such models struggle to anticipate the near future and plan their actions accordingly. Humans do so by first observing what objects are present in the environment and, by studying their type and history, predicting how they may evolve in the near future. Based on this observation, we first demonstrate the limitation of behavior cloning in making safe and reliable decisions. Then, we propose a hierarchical approach to teach an agent how to make safer decisions based on the plausible future. The key idea is that, instead of hand-picking future features, we integrate a high-dimensional prediction module, such as predicting future RGB or semantically segmented frames, into our model, allowing it to learn the required features by itself. Finally, we demonstrate qualitatively and quantitatively that this approach yields safer decisions by the agent.

Mohammad Hossein Nazeri, Mahdi Bohlouli
2021 | IEEE International Conference on Data Mining (ICDM)

Considering the diversity of cloud computing services offered in federated clouds, users should be well aware of their current and future resource requirements and of the quality-of-service parameter values in order to compose proper services from a pool of clouds. Various approaches have been proposed to address this issue and predict the quality-of-service parameters, which are stored in the form of time series. Those works mostly discover patterns either between separate time series or inside a specific time series, but not both together. The main research gap covered in this work is measuring similarities both within the current time series and between different time series. This work proposes a novel hybrid approach based on time-series clustering, minimum description length, and dynamic time warping similarity to analyze user needs and provide the best-fit quality-of-service prediction to users across the multi-cloud. We treat time as an important factor, and the system analyzes changes over time. Furthermore, the proposed method is a shape-based prediction that uses dynamic time warping to cover geographical time-zone differences, with a novel preprocessing step that uses statistically generated semi-real data to compensate for noisy data. The experimental results show predictions very close to the real values observed in practice, with a mean absolute error of about 0.5 on average. We used the WS-DREAM dataset, which is widely adopted in this area.
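Dynamic time warping, the similarity measure this approach relies on, aligns two series by a minimal-cost warping path, which is what makes it robust to time-shifted (e.g., time-zone-offset) patterns. A textbook O(n·m) implementation:

```python
def dtw(a, b):
    # Classic dynamic-time-warping distance with absolute-difference cost:
    # d[i][j] = cost(a[i-1], b[j-1]) + min(insert, delete, match).
    inf = float("inf")
    n, m = len(a), len(b)
    d = [[inf] * (m + 1) for _ in range(n + 1)]
    d[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            d[i][j] = cost + min(d[i - 1][j],      # a[i-1] skipped in b
                                 d[i][j - 1],      # b[j-1] skipped in a
                                 d[i - 1][j - 1])  # matched pair
    return d[n][m]
```

Note that a series and its time-stretched copy have DTW distance zero, whereas Euclidean distance would penalize the misalignment; this is why a shape-based predictor prefers DTW.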

Amin Keshavarzi, Abolfazl Toroghi Haghighat, Mahdi Bohlouli
2021 | Iranian Journal of Science and Technology, Transactions of Electrical Engineering

Chronic Kidney Disease (CKD) is widely recognized as a health-threatening issue, especially in developing countries, where receiving proper treatment is very expensive. Therefore, early prediction of CKD, which protects the kidney and breaks the gradual progress of the disease, has become an important issue for physicians and scientists. The Internet of Things (IoT), a paradigm in which low-cost body sensors and smart multimedia medical devices provide remote monitoring of kidney function, plays an important role here, especially where medical care centers are hardly accessible for most people. To this end, this paper proposes a diagnostic prediction model for CKD and its severity that uses IoT multimedia data. Since the features influencing CKD are numerous and the volume of IoT multimedia data is usually very large, different feature sets, based on physicians’ clinical observations and experience as well as previous CKD studies, are selected in different groups of multimedia datasets to assess the performance of CKD prediction and severity determination with different classification techniques. The experimental results reveal that the applied dataset with the proposed selected features produces 97% accuracy, 99% sensitivity, and 95% specificity with the decision tree (J48) classifier, compared to Support Vector Machine (SVM), Multi-Layer Perceptron (MLP), and Naïve Bayes classifiers. The proposed feature set also improves the execution time compared to other datasets with different features.
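The three reported measures (accuracy, sensitivity, specificity) derive directly from confusion-matrix counts; a small helper makes the definitions concrete. This is purely illustrative, with class 1 assumed to mean CKD-positive:

```python
def diagnostic_metrics(y_true, y_pred):
    # Count the four confusion-matrix cells for a binary classifier.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(y_true),            # all correct / all
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,  # recall on sick
        "specificity": tn / (tn + fp) if tn + fp else 0.0,  # recall on healthy
    }
```

For a screening task like CKD, high sensitivity (few missed patients) is typically valued over specificity, which is why the paper reports all three rather than accuracy alone.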

Mehdi Hosseinzadeh, Jalil Koohpayehzadeh, Ahmed Omar Bali, Parvaneh Asghari, Alireza Souri, Ali Mazaherinezhad, Mahdi Bohlouli, Reza Rawassizadeh
2021 | Multimedia Tools and Applications

Nowadays, the volume and variety of generated data, how to process it, and how to create value from it through scalable analytics are main challenges for industries and real-world practices such as talent analytics. For instance, large enterprises and job centres have to perform data-intensive matching of job seekers to many job positions at the same time. In other words, the result should be the large-scale assignment of the best-fit (right) talents (Person) with the right expertise (Profession) to the right job (Position) at the right time (Period); we call this the 4P rule in this paper. All enterprises should consider the 4P rule in their daily recruitment processes as part of efficient workforce development strategies. Such consideration demands integrating large volumes of disparate data from various sources and strongly requires scalable algorithms and analytics. The diversity of data in human resource management requires speeding up analytical processes, and the main challenge is not only how and where to store the data, but also how to analyse it to create value (knowledge discovery). In this paper, we propose a generic Career Knowledge Representation (CKR) model able to capture most competences that exist in a wide variety of careers. Regenerated job-qualification data of 15 million employees with 84 dimensions (competences), derived from real HRM data, is used to test and evaluate the proposed Evolutionary MapReduce K-Means (EMR) method. The proposed EMR method shows faster and more accurate experimental results than similar approaches; it has been tested on real large-scale datasets, and the achieved results are discussed.
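One iteration of K-Means maps cleanly onto MapReduce: the map phase assigns each record to its nearest centroid, and the reduce phase averages each group into a new centroid. A one-dimensional sketch of that split (the evolutionary layer of the paper's EMR method is omitted here):

```python
from collections import defaultdict

def kmeans_step(points, centroids):
    # Map phase: emit (centroid_index, point) for the nearest centroid.
    groups = defaultdict(list)
    for p in points:
        j = min(range(len(centroids)), key=lambda i: abs(p - centroids[i]))
        groups[j].append(p)
    # Reduce phase: average each group into the updated centroid.
    return [sum(g) / len(g) for j, g in sorted(groups.items())]
```

In a real MapReduce deployment, the map phase runs in parallel over data shards and only the per-centroid sums and counts travel to the reducers, which is what makes the approach scale to millions of employee records.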

Mahdi Bohlouli, Zhonghua He
2021 | Companion Proceedings of the Web Conference

It is important to understand how scientists grow and succeed in their research field relative to the growth of the field itself. In many cases, people rely only on indices such as citation count, h-index, and i10-index and compare scientists from different fields in similar situations with the same variables. This is not a fair comparison, since fields differ in how they develop and how they are cited. In this paper, we borrow the concept of acceleration from physics and propose a new method with new metrics to evaluate scientists efficiently and fairly, based on a real-time analysis of their recent status relative to their field's growth. The method takes into account inputs such as whether a person is a beginner or an established scientist, applies these key inputs in the evaluation, and evaluates over time. The results showed better evaluation compared to state-of-the-art metrics.

Mahdi Bohlouli, Jonathan Hermann, Fabian Sunnus
2021 | Companion Proceedings of the Web Conference

Children with autism spectrum disorders (ASDs) show disturbances in their activities; usually, they cannot speak fluently and instead use gestures and pointing words to communicate. Hence, understanding their needs is one of the most challenging tasks for caregivers, but early diagnosis of the disorder can make it much easier. The lack of verbal and nonverbal communication can be compensated for by assistive technologies and the Internet of Things (IoT). IoT-based systems help diagnose the disorder and improve patients’ lives by applying Deep Learning (DL) and Machine Learning (ML) algorithms. This paper provides a systematic review of ASD approaches in the context of IoT devices. The main goal of this review is to recognize significant research trends in the field of IoT-based healthcare. A technical taxonomy is also presented to classify the existing papers on ASD methods and algorithms, and a statistical and functional analysis of the reviewed ASD approaches is provided based on evaluation metrics such as accuracy and sensitivity.

Mehdi Hosseinzadeh, Jalil Koohpayehzadeh, Ahmed Omar Bali, Farnoosh Afshin Rad, Alireza Souri, Ali Mazaherinezhad, Aziz Rezapour, Mahdi Bohlouli
2020 | The Journal of Supercomputing

Service monitoring in federated clouds generates large-scale QoS time series data with various unknown, frequent, and abnormal patterns. Such patterns can be associated with inaccurate resource provisioning, and discovering them helps avoid violations through predictive and preventive actions. Sufficient intelligence, in the form of an expert system for decision support, is needed in such situations. The main challenge, therefore, is to efficiently discover unknown frequent and abnormal patterns from the QoS time series data of federated clouds. This data is unlabeled and consists of frequent and abnormal structures, and studies have shown that clustering is the most common and efficient method to discover interesting patterns and structures in unlabeled data. However, clustering normally carries a time overhead that should be optimized, as well as accuracy issues, mainly in connection with convergence and finding the optimum number of clusters. This work proposes a new genetic-based clustering algorithm that shows better accuracy and speed than state-of-the-art methods and can find the optimum number of clusters concurrently with the clustering itself. The accuracy and convergence achieved in the experimental results support its use in expert systems, mainly for resource provisioning and further autonomous decision-making in federated clouds. In addition to the scientific impact of this paper, the proposed method can be used in practice by federated cloud service providers.
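The key idea, letting the number of clusters emerge from the optimization itself, can be shown with a fitness of the form SSE + penalty·k: adding a cluster must "pay" for the SSE it removes. The tiny exhaustive search below stands in for the paper's genetic search, and the penalty value is an assumption:

```python
from itertools import combinations

def sse(points, centroids):
    # Within-cluster sum of squared distances (1-D) to the nearest centroid.
    return sum(min((p - c) ** 2 for c in centroids) for p in points)

def best_clustering(points, max_k=3, penalty=0.5):
    # Fitness = SSE + penalty * k, so k is found concurrently with the
    # centroids. Candidate centroids are drawn from the points themselves.
    best, best_f = None, float("inf")
    for k in range(1, max_k + 1):
        for cent in combinations(points, k):
            f = sse(points, cent) + penalty * k
            if f < best_f:
                best, best_f = list(cent), f
    return best
```

On two well-separated groups of QoS values, the penalized fitness settles on exactly two centroids; a genetic algorithm searches the same space but scales to sizes where exhaustive enumeration is infeasible.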

Amin Keshavarzi, Abolfazl Toroghi Haghighat, Mahdi Bohlouli
2020 | Journal of Expert Systems with Applications

The main question in today’s rapidly changing world is how fast, and to what sort of corresponding knowledge, an agent should adapt. This can be defined as a knowledge-mapping problem for decisions based on large-scale datasets, with veracity and accuracy as key criteria, especially in safety-critical systems. This paper proposes a hybrid and scalable approach for Multi-Criteria Decision Making (MCDM) problems that is deployed in MapReduce. The sector-specific problem solved here is recommending training resources that efficiently close the skill gaps of job seekers. The main innovations of this work are: (1) the use of a large-scale, semi-real skill-analytics and training-resources dataset (dataset perspective); (2) a hybrid MCDM approach that resolves skill gaps by matching required skills to training resources (decision-support perspective), which can be applied to any other sector with a matching problem; and (3) the use of MapReduce as a scalable processing approach to deliver lower processing latency and higher quality for large-scale datasets (big data and scalability perspective). The experimental results showed 89% accuracy in the clustering and matching results, and the recommendation results have been tested and verified with the industrial partner.
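At its simplest, an MCDM matcher scores each training resource by a weighted sum over normalized criteria. The criteria names and weights below are illustrative; the paper's hybrid approach layers clustering and MapReduce on top of this kind of scoring:

```python
def mcdm_rank(alternatives, weights):
    # Weighted-sum MCDM: each alternative carries criteria assumed to be
    # normalized to [0, 1]; a higher aggregate score is a better match.
    def score(alt):
        return sum(w * alt["criteria"][name] for name, w in weights.items())
    return sorted(alternatives, key=score, reverse=True)
```

Here the weights encode the decision policy, e.g. prioritizing how well a course closes the seeker's skill gap over how affordable it is; a map phase can score shards of resources in parallel and a reduce phase merges the top candidates.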

Mahdi Bohlouli, Martin Schrage
2020 | IEEE International Conference on Big Data (Big Data)

Realizing cloud-enabled Knowledge modeling as a Service (KaaS) supports acquiring, managing, and sharing knowledge from different types of sources. Cloud computing lets users access and edit their data easily, without administration or software development expenses and difficulties. In this paper, an overview and implementation results of a concept called Knowledge Forge are described. The proposed concept has been successfully tested and applied to the representation and management of job knowledge (competences) for assessing talents’ proficiency. It can also be used to model the proficiency of graduates and students in order to identify their competence gaps and assign them proper learning sources to improve their competitiveness. Knowledge Forge can easily be extended to a wide variety of applications to support further knowledge formats and application areas; new formats use their own graphical interfaces and methods, while no changes in the system itself are required. The main goal of this research is to enable skilled workers and knowledge engineers to search for, modify, and connect different formats of knowledge (e.g., unstructured knowledge). Neither users nor administrators have to care about the data format, so a search may surface better results from all stored data, alongside external results provided by web search engines. The result of this work is a framework for all KaaS purposes. Although its use is limited at the moment, it creates a basis for new applications in several areas; further work can enhance the functionality or simplify the enhancement process so that even users can create their own applications.

Mahdi Bohlouli, Sebastian Hellekes
2020 | IEEE International Conference on Big Data (Big Data)

Chronic Kidney Disease (CKD) is widely recognized as a health-threatening issue, especially in developing countries, where receiving proper treatment is very expensive. Therefore, early prediction of CKD, which protects the kidney and breaks the gradual progress of the disease, has become an important issue for physicians and scientists. The Internet of Things (IoT), a paradigm in which low-cost body sensors and smart multimedia medical devices provide remote monitoring of kidney function, plays an important role here, especially where medical care centers are hardly accessible for most people. To this end, this paper proposes a diagnostic prediction model for CKD and its severity that uses IoT multimedia data. Since the features influencing CKD are numerous and the volume of IoT multimedia data is usually very large, different feature sets, based on physicians’ clinical observations and experience as well as previous CKD studies, are selected in different groups of multimedia datasets to assess the performance of CKD prediction and severity determination with different classification techniques. The experimental results reveal that the applied dataset with the proposed selected features produces 97% accuracy, 99% sensitivity, and 95% specificity with the decision tree (J48) classifier, compared to Support Vector Machine (SVM), Multi-Layer Perceptron (MLP), and Naïve Bayes classifiers. The proposed feature set also improves the execution time compared to other datasets with different features.

Mehdi Hosseinzadeh, Jalil Koohpayehzadeh, Ahmed Omar Bali, Parvaneh Asghari, Alireza Souri, Ali Mazaherinezhad, Mahdi Bohlouli, Reza Rawassizadeh
2020 | Multimedia Tools and Applications
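The reported figures (97% accuracy, 99% sensitivity, 95% specificity) follow directly from a binary confusion matrix. As a minimal illustration of how such metrics are computed, using hypothetical counts rather than the paper’s actual data:

```python
# Illustrative computation of the evaluation metrics reported for a binary
# classifier (e.g. CKD vs. healthy) from confusion-matrix counts.
# The counts below are hypothetical, not the paper's results.

def binary_metrics(tp, fp, tn, fn):
    """Return accuracy, sensitivity (recall) and specificity."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    sensitivity = tp / (tp + fn)   # true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    return accuracy, sensitivity, specificity

# Hypothetical screening outcome: 99 of 100 sick patients detected,
# 95 of 100 healthy patients correctly cleared.
acc, sens, spec = binary_metrics(tp=99, fp=5, tn=95, fn=1)
print(f"accuracy={acc:.2f} sensitivity={sens:.2f} specificity={spec:.2f}")
```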

Outlier detection has received special attention in various fields, mainly those dealing with machine learning and artificial intelligence. As strong outliers, anomalies are divided into point, contextual and collective outliers. The most important challenges in outlier detection include the thin boundary between remote points and the natural area, the tendency of new data and noise to mimic real data, unlabeled datasets, and differing definitions of outliers across applications. Considering these challenges, we define new types of anomalies, called Collective Normal Anomaly and Collective Point Anomaly, to enable much better detection of the thin boundary between different types of anomalies. Basic, domain-independent methods are introduced to detect these anomalies in both unsupervised and supervised datasets. A Multi-Layer Perceptron Neural Network is enhanced with a Genetic Algorithm to detect the newly defined anomalies with higher precision, ensuring a test error lower than that of the conventional Multi-Layer Perceptron Neural Network. Experimental results on benchmark datasets show a reduced error in the anomaly detection process in comparison to baselines.

Rasoul Kiani, Amin Keshavarzi, Mahdi Bohlouli
2020 | The Applied Artificial Intelligence Journal
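As a schematic sketch of the idea of enhancing a neural network with a Genetic Algorithm, the toy GA below searches for a hidden-layer size that minimizes a stand-in test-error function. The fitness function, parameter ranges and GA settings are illustrative assumptions, not the authors’ implementation:

```python
import random

# Toy Genetic Algorithm tuning one MLP hyperparameter (hidden-layer size).
# toy_test_error is a hypothetical stand-in for real MLP training and
# evaluation; its minimum is placed at 32 hidden units.

def toy_test_error(hidden_units):
    return (hidden_units - 32) ** 2 / 1024 + 0.05

def evolve(generations=50, pop_size=20, seed=0):
    rng = random.Random(seed)
    population = [rng.randint(1, 128) for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        population.sort(key=toy_test_error)
        history.append(toy_test_error(population[0]))  # best error so far
        parents = population[: pop_size // 2]          # elitist selection
        children = []
        while len(children) < pop_size - len(parents):
            a, b = rng.sample(parents, 2)
            child = (a + b) // 2                       # averaging crossover
            if rng.random() < 0.2:                     # mutation
                child = min(128, max(1, child + rng.randint(-8, 8)))
            children.append(child)
        population = parents + children
    best = min(population, key=toy_test_error)
    return best, history

best, history = evolve()
print("best hidden units:", best)
```

Because the top half of each generation is carried over unchanged, the best error can never increase from one generation to the next, mirroring the paper’s goal of ensuring a test error no worse than the baseline network.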

Language recognition has advanced significantly in recent years by means of modern machine learning methods, such as deep learning, and benchmarks with rich annotations. However, research is still limited for low-resource formal languages, and there is a significant gap in describing colloquial language, especially for low-resource ones such as Persian. To address this gap for low-resource languages, we propose the Large Scale Colloquial Persian Dataset (LSCP). LSCP is hierarchically organized in a semantic taxonomy that frames multi-task informal Persian language understanding as a comprehensive problem. This encompasses the recognition of multiple semantic aspects in human-level sentences, which are naturally captured from real-world text. We believe that further investigation and processing, as well as the application of novel algorithms and methods, can strengthen and enrich the computerized understanding and processing of low-resource languages. The proposed corpus consists of 120M sentences derived from 27M tweets, annotated with parse trees, part-of-speech tags, sentiment polarity and translations into five different languages.

Hadi Abdi Khojasteh, Ebrahim Ansari and Mahdi Bohlouli
2020 | The 12th Language Resources and Evaluation Conference (LREC)
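To make the multi-task annotation concrete, here is a small sketch of what a record in such a corpus might look like, carrying POS tags, sentiment polarity and translations per sentence. The field names and example sentences are illustrative assumptions, not LSCP’s actual schema:

```python
# Hypothetical record layout for a multi-task annotated corpus like LSCP:
# each sentence carries POS tags, a sentiment label and translations.

records = [
    {
        "text": "che khabar?",  # informal Persian, romanized here
        "pos": [("che", "DET"), ("khabar", "NOUN")],
        "sentiment": "neutral",
        "translations": {"en": "what's up?"},
    },
    {
        "text": "kheyli khoob bood",
        "pos": [("kheyli", "ADV"), ("khoob", "ADJ"), ("bood", "VERB")],
        "sentiment": "positive",
        "translations": {"en": "it was very good"},
    },
]

def filter_by_sentiment(recs, polarity):
    """Select sentence texts matching a sentiment polarity label."""
    return [r["text"] for r in recs if r["sentiment"] == polarity]

print(filter_by_sentiment(records, "positive"))
```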

This book presents outstanding theoretical and practical findings in data science and associated interdisciplinary areas. Its main goal is to explore how data science research can revolutionize society and industries in a positive way, drawing on pure research to do so. The topics covered range from pure data science to fake news detection, as well as the Internet of Things in the context of Industry 4.0. Data science is a rapidly growing field and, as a profession, incorporates a wide variety of areas, from statistics, mathematics and machine learning to applied big data analytics. According to Forbes magazine, “Data Science” was listed as LinkedIn’s fastest-growing job in 2017. This book presents selected papers from the International Conference on Contemporary Issues in Data Science (CiDaS 2019), a professional data science event that provided a real workshop (not a “listen-shop”) where scientists and scholars had the chance to share ideas, form new collaborations and brainstorm on major challenges, and where industry experts could catch up on emerging solutions to their concrete data science problems. Given its scope, the book will benefit not only data scientists and scientists from other domains, but also industry experts, policymakers and politicians.

Mahdi Bohlouli, Bahram Sadeghi Bigham, Zahra Narimani, Mehdi Vasighi, Ebrahim Ansari
2020 | Springer Verlag, Berlin