I am broadly interested in data analytics and decision-making methodologies and their applications in various fields. My research also focuses on optimization-based clustering methodologies and on the use of stochastic simulation for predictive analysis and decision making in complex systems under uncertainty.
Clustering can be formulated as an optimization problem. We propose to use DEPSO (Differential Evolution Particle Swarm Optimization) to efficiently solve for an optimal clustering scheme as measured by some clustering validity indices. This paper won the Overall Best Conference Paper Award from the 2010 World Congress on Computational Intelligence (WCCI). WCCI is the biggest and one of the most important events for the computational intelligence field and hosts the International Joint Conference on Neural Networks (IJCNN), the IEEE International Conference on Fuzzy Systems (FUZZ-IEEE), and the IEEE Congress on Evolutionary Computation (IEEE CEC).
>>Download Xu, R., Xu, J., Wunsch, D.C. 2010. “Clustering with Differential Evolution Particle Swarm Optimization”. IEEE World Congress on Computational Intelligence, Barcelona, Spain.
As the validity indices are usually data dependent and are designed to address certain types of data, the choice of index as the fitness function may critically affect cluster quality. We compare the performance of eight well-known and widely used clustering validity indices, namely, the Calinski-Harabasz index, the CS index, the Davies-Bouldin index, the Dunn index with two of its generalized versions, the I index, and the silhouette statistic index, on both synthetic and real data sets in the framework of DEPSO-based clustering. According to the experimental results, the silhouette statistic index stands out in most of the data sets that we examined. Meanwhile, we suggest that users not base their conclusions on a single index, but consider the results of several indices to obtain reliable clustering structures. >>Download Xu, R., Xu, J., and Wunsch, D.C. 2012. A Comparison Study of Validity Indices on Swarm Intelligence-Based Clustering. IEEE Transactions on Systems, Man and Cybernetics - Part B, 42:1243 - 1256.
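The idea of using a validity index as a fitness function can be sketched as follows. This is only an illustration, not the implementation from the papers: it scores a candidate set of centroids (the kind of solution a swarm particle might encode) with the Davies-Bouldin index, where lower values are better. The function names `assign` and `davies_bouldin` are hypothetical, and the sketch assumes that scatter is measured around the candidate centroids rather than recomputed cluster means.

```python
import math

def assign(points, centroids):
    """Return, for each point, the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
        for p in points
    ]

def davies_bouldin(points, centroids):
    """Davies-Bouldin index of the partition induced by the centroids.

    Lower is better. Candidates with an empty cluster or with coinciding
    centroids are scored as infinitely bad (an assumption of this sketch).
    """
    labels = assign(points, centroids)
    k = len(centroids)
    scatter = []
    for j in range(k):
        members = [p for p, label in zip(points, labels) if label == j]
        if not members:
            return math.inf
        # Mean distance of the cluster's members to its centroid.
        scatter.append(sum(math.dist(p, centroids[j]) for p in members) / len(members))
    total = 0.0
    for i in range(k):
        worst = 0.0
        for j in range(k):
            if j == i:
                continue
            separation = math.dist(centroids[i], centroids[j])
            if separation == 0.0:
                return math.inf
            # Similarity of clusters i and j: combined scatter over separation.
            worst = max(worst, (scatter[i] + scatter[j]) / separation)
        total += worst
    return total / k

# Toy data: two tight groups. A search procedure would prefer the candidate
# with the lower index value.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
good = [(1 / 3, 1 / 3), (31 / 3, 31 / 3)]
bad = [(0, 0), (1, 0)]
print(davies_bouldin(points, good), davies_bouldin(points, bad))
```

Swapping in a different index only requires replacing `davies_bouldin`, which makes it straightforward to compare the clusterings that several indices favor.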
Health Care Applications
Cancer classification using Default ARTMAP. High-throughput messenger RNA (mRNA) expression profiling with microarrays has been shown to be a more effective basis for cancer diagnosis and treatment than traditional methods based on morphology or clinical parameters. Recently, the discovery of a category of small non-coding RNAs, named microRNAs (miRNAs), has provided another promising method of cancer classification. miRNAs play a critical role in the tumorigenic process by functioning either as oncogenes or as tumor suppressors. Here, we apply a neural-network-based classifier, Default ARTMAP, to classify broad types of cancers based on their miRNA expression fingerprints. As miRNA expression data are usually high dimensional, particle swarm optimization (PSO) is used to select important miRNAs that contribute to the discrimination of different cancer types. Experimental results on multiple human cancers show that Default ARTMAP performs consistently well on all the data, and that its classification accuracy is better than or comparable to that of other popular classifiers. Also, the selection of informative miRNAs can further improve the performance of classifiers and provide meaningful insights to cancer researchers. >>Download Xu, R., Xu, J., Wunsch, D.C. 2009. “MicroRNA expression profile based cancer classification using Default ARTMAP”, Neural Networks 22(5) 774-780.
Simulation of the development of drug resistance and re-emergence of parasites. Emerging drug resistance in parasitology and its impact on human and animal health are of serious concern. Attempts by the parasitology community to address this issue led to the introduction of so-called selective therapy where a proportion of the population is left untreated. This has led to re-emergence of parasites that have heretofore been controlled. Using stochastic simulations, this article explores the tradeoff between drug resistance and re-emergence. More importantly, the article identifies the importance of the parasite fitness parameter vector and its role in drug resistance. Suggestions for further biological work and statistical analyses are also provided. >>Download Xu, J., Vidyashankar, A., and Nielsen, M. 2014. "Drug Resistance or Re-emergence? Simulating Equine Parasites", ACM Transactions on Modeling and Computer Simulation, 24(4), Article 20.
Risk-adjusted product portfolio optimization. This was in collaboration with IBM T.J. Watson Research Center. A NORTA (NORmal To Anything) sampling procedure was integrated with a risk-adjusted optimization engine to estimate conditional value-at-risk (CVaR) and manage new product development portfolios from a risk-aware perspective. A web-based decision tool known as PRIME was developed. >>Download Subramanian, D., Huang, P., Pulavarthi, C., Xu, J., Sekhar, H., Zhan, S., Tripathi, S. and Kumar, S. 2010. “Risk-adjusted approach to optimize investments in product development portfolios”, IBM Journal of Research & Development, 54(3).
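To make the two ingredients concrete, here is a minimal sketch of NORTA-style sampling followed by an empirical CVaR estimate. It is not the PRIME implementation. The marginals (`expo_inv`, `unif_inv`), the correlation value 0.6, and all function names are assumptions chosen for illustration. Note also that `rho` is the correlation of the underlying normal pair; a full NORTA procedure would solve for the normal correlation that yields a target correlation between the outputs, which this sketch skips.

```python
import math
import random
from statistics import NormalDist

_STD = NormalDist()  # standard normal used for the normal-to-uniform step

def norta_pair(rho, inv_cdf_a, inv_cdf_b, rng):
    """Draw one dependent pair with the requested marginal distributions."""
    z1 = rng.gauss(0.0, 1.0)
    z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
    # Map normals to uniforms, clamped away from 0 and 1 so that the
    # inverse CDFs below always return finite values.
    u1 = min(max(_STD.cdf(z1), 1e-12), 1.0 - 1e-12)
    u2 = min(max(_STD.cdf(z2), 1e-12), 1.0 - 1e-12)
    return inv_cdf_a(u1), inv_cdf_b(u2)

def cvar(losses, alpha):
    """Empirical CVaR: mean of the worst (1 - alpha) fraction of losses."""
    ordered = sorted(losses)
    start = min(math.ceil(alpha * len(ordered)), len(ordered) - 1)
    tail = ordered[start:]
    return sum(tail) / len(tail)

def expo_inv(u, mean=10.0):
    """Inverse CDF of an exponential marginal (hypothetical cost overrun)."""
    return -mean * math.log(1.0 - u)

def unif_inv(u, low=2.0, high=5.0):
    """Inverse CDF of a uniform marginal (hypothetical delay cost)."""
    return low + (high - low) * u

rng = random.Random(0)
losses = [sum(norta_pair(0.6, expo_inv, unif_inv, rng)) for _ in range(10_000)]
print("mean loss:", sum(losses) / len(losses))
print("CVaR at 95%:", cvar(losses, 0.95))
```

Because CVaR averages only the tail of the loss distribution, it is always at least as large as the mean loss, which is what makes it useful as a risk-aware objective.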
Optimal selection of media vehicles using customer databases. This paper investigates the problem where an organization must select among multiple media vehicles for a marketing campaign, and determine how many names from each vehicle to contact (the contact depth). The organization can estimate the return from contacting each prospective customer, and this return decreases as depth increases. Different vehicles have different marginal costs per impression, and may have minimum-spend requirements or activation costs. Decisions are made to maximize profit subject to a constraint on the total amount spent. We propose an optimization model and two computationally efficient methods that often lead to globally optimal solutions under practical assumptions. The model is illustrated with two data sets. >>Download Malthouse, E., Qiu, D., and Xu, J. 2012. Optimal Selection of Media Vehicles using Customer Databases. Expert Systems with Applications, 39:13035-13045.
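The structure of this decision problem can be illustrated with a small heuristic sketch. It is not one of the two methods from the paper: it simply enumerates which vehicles to activate and then buys impressions greedily by return per dollar, which is only practical for a handful of vehicles. Minimum-spend requirements are omitted, and the data layout (`cost`, `activation`, `returns`) and function names are hypothetical.

```python
from itertools import combinations

def allocate(vehicles, active, budget):
    """Greedy depth allocation for a fixed set of activated vehicles.

    Returns (profit, depth per vehicle), or None if the activation costs
    alone exceed the budget.
    """
    depth = {v: 0 for v in active}
    spent = sum(vehicles[v]["activation"] for v in active)
    if spent > budget:
        return None
    profit = -spent
    while True:
        best = None
        for v in active:
            info = vehicles[v]
            if depth[v] >= len(info["returns"]):
                continue  # no names left in this vehicle
            gain, cost = info["returns"][depth[v]], info["cost"]
            if gain <= cost or spent + cost > budget:
                continue  # unprofitable or unaffordable impression
            if best is None or gain / cost > best[0]:
                best = (gain / cost, v, gain, cost)
        if best is None:
            return profit, depth
        _, v, gain, cost = best
        depth[v] += 1
        spent += cost
        profit += gain - cost

def best_plan(vehicles, budget):
    """Try every non-empty subset of vehicles and keep the most profitable."""
    best = (0.0, {})  # doing nothing yields zero profit
    names = list(vehicles)
    for size in range(1, len(names) + 1):
        for active in combinations(names, size):
            result = allocate(vehicles, active, budget)
            if result is not None and result[0] > best[0]:
                best = result
    return best

# Hypothetical data: "returns" lists the expected return of each successive
# name, sorted in decreasing order (returns fall as depth increases).
vehicles = {
    "mail": {"cost": 1.0, "activation": 2.0, "returns": [5, 4, 3, 0.5]},
    "email": {"cost": 0.5, "activation": 0.0, "returns": [1.0, 0.8, 0.4]},
}
print(best_plan(vehicles, budget=4.0))
print(best_plan(vehicles, budget=10.0))
```

With the tighter budget the plan concentrates on the vehicle with the highest returns; with the larger budget it becomes worthwhile to use both.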
A Metamodeling-Based Approach for Production Planning. In production planning, one of the major challenges for plan optimization lies in quantifying the dependence of the objective criterion (typically total cost) upon the decision variables that specify a release plan of jobs. Existing methods either fall short in capturing such a relationship, which involves non-stationary stochastic processes of a manufacturing system (e.g., the number of jobs over time), or require discrete-event simulation (DES) to evaluate the objective criterion for each candidate decision, which is time-consuming. To enable accurate and precise estimation of the objective for any decision plan within a reasonable time, this work proposes a metamodeling-based approach. The metamodels take the form of difference equations, retain the high fidelity of DES, and can be used to address "what if" questions in a timely manner. When embedded in the optimization of production planning, the metamodels can help to improve the quality and responsiveness of decision making. >>Download Li, M., F. Yang, R. Uzsoy, and J. Xu. 2016. A metamodel-based Monte Carlo simulation approach for responsive production planning of manufacturing systems. Journal of Manufacturing Systems 38, 114-133.
The impact of production/distribution system structure on product line proliferation strategies. Our research shows that companies with different production/distribution system structures (make-to-stock/make-to-order) have different product line proliferation strategies in a modular design environment. Our work was among the first studies to shed light on product line design in a joint marketing and operations setting. >>Xu, J., Hopp, W.J., and Nelson, B.L. 2014. "The Impact of Production/Distribution System Structure on Product Line Proliferation Strategies", under revision for IIE Transactions.