CLIP (Contrastive Language-Image Pre-training) excels in zero-shot image classification across diverse domains, making it an ideal candidate for pre-labeling unlabeled datasets. This paper introduces three enhancements designed to improve CLIP-based pre-labeling efficacy without the need for labeled data. First, we introduce prompt refinement using a large language model (GPT-3.5-Turbo) to generate more descriptive prompts, significantly boosting accuracy on various datasets. Second, we address overconfident predictions through confidence calibration, achieving improved results without the need for a separate labeled validation set. Lastly, we leverage the inductive biases of CLIP and DINOv2 through ensembling, demonstrating a substantial boost in zero-shot labeling accuracy. Experimental results across various datasets consistently demonstrate enhanced performance, particularly in handling ambiguous classes. This work not only addresses limitations in CLIP but also provides valuable insights for advancing multimodal models in real-world applications.
@article{mehtaenhancing,title={Enhancing Zero-Shot Image Classification: A Triad Approach with Prompt Refinement, Confidence Calibration, and Ensembling},author={Mehta, Raghav and Sundaraiah, Rakshith and Vadarevu, Sabarish and Karamcheti, Vijay},year={2024},volume={5},publisher={ADaSci},journal={Lattice Volume 1},url={https://www.researchgate.net/publication/388965828_Enhancing_Zero-Shot_Image_Classification_A_Triad_Approach_with_Prompt_Refinement_Confidence_Calibration_and_Ensembling},}
2020
Analyzing performance of deep learning techniques for web navigation prediction
The weblog is dynamic, and its size grows exponentially over time in terms of navigation sessions. These stored sessions are used for Web Navigation Prediction (WNP). Each user behaves differently on the web, and so do their navigation sessions. With a variety of large, dynamic sessions, the task of navigation prediction is becoming challenging. There is a need for an effective method to handle large sessions with multiple labels for predicting the information a user desires. This paper analyzes the performance of deep learning techniques such as Multi-Layer Perceptron and Long Short-Term Memory with respect to parameters such as the number of hidden units, number of layers, activation function, optimization function, learning rate, and batch size. The networks were trained on six experimental parameter setups to form 216 models. The performance of these models is evaluated on two real datasets: BMS and CTI. Long Short-Term Memory was observed to perform best on most of the setups.
@article{jindal2020analyzing,title={Analyzing performance of deep learning techniques for web navigation prediction},author={Jindal, Honey and Sardana, Neetu and Mehta, Raghav},journal={Procedia Computer Science},volume={167},pages={1739--1748},year={2020},publisher={Elsevier},doi={10.1016/j.procs.2020.03.384},url={https://www.sciencedirect.com/science/article/pii/S1877050920308504},}
Efficient web navigation prediction using hybrid models based on multiple evidence combinations
Modeling users' navigation sequences and predicting their preferences has been an interesting area of research. For Web Navigation Prediction (WNP), Markov models are predominantly used for analyzing and discovering user navigation patterns. One of the major issues with the Markov model is that it fails to predict for unclassified navigations. The presence of such navigations reduces the prediction power of the model. Deep machine learning models can be used to address unclassified navigations, but their prediction ability deteriorates when training sessions are few. Because navigations are modeled using N-grams, and the number of training sessions decreases at higher N-grams, the performance of deep learning models might be affected. However, their prediction ability can be improved by integrating them with the Markov model. This paper proposes three integrated models to minimize unclassified navigations and to boost overall prediction accuracy. The proposed hybrid models are formed by integrating the All-Kth Markov Model with a Deep Neural Network (DKM) and the All-Kth Modified Markov Model with a Shallow Neural Network and a Deep Neural Network (SKMM and DKMM). The proposed models are evaluated on three standard datasets: CTI, BMS, and Wikispeedia. DKMM obtained the best results in terms of improvement in prediction accuracy and reduction in unclassified navigations at higher N-grams. Prediction accuracy improved by up to 4.71, 6.2, and 7.67 on the CTI, BMS, and Wikispeedia datasets, respectively.
@article{jindal2020efficient,title={Efficient web navigation prediction using hybrid models based on multiple evidence combinations},author={Jindal, Honey and Sardana, Neetu and Mehta, Raghav},journal={International Journal of Computers and Applications},volume={42},number={7},pages={715--728},year={2020},publisher={Taylor \& Francis},doi={10.1080/1206212X.2019.1680011},url={https://www.tandfonline.com/doi/abs/10.1080/1206212X.2019.1680011},}
2019
Analysis and Visualization of User Navigations on Web
The web is the largest repository of data. Users frequently navigate the web to access information. These navigational patterns are stored in weblogs, which grow exponentially over time. This growth in voluminous weblog data raises major challenges such as handling big data, understanding navigation patterns, and the structural complexity of the web. Visualization is a process for viewing large, complex web data graphically to address these challenges. This chapter describes the various aspects of visualization through which novel insights can be drawn in the area of web navigation mining. To analyze user navigations, visualization can be applied in two stages: post pre-processing and post pattern discovery. The first stage analyzes the website structure, website evolution, user navigation behavior, and frequent and rare patterns, and detects noise. The second stage analyzes the interesting patterns obtained from prediction modeling of web data. The chapter also highlights popular visualization tools for analyzing weblog data.
@incollection{jindal2019analysis,title={Analysis and Visualization of User Navigations on Web},author={Jindal, Honey and Sardana, Neetu and Mehta, Raghav},booktitle={Data Visualization and Knowledge Engineering: Spotting Data Points with Artificial Intelligence},pages={195--221},year={2019},publisher={Springer International Publishing},doi={10.1007/978-3-030-25797-2_9},isbn={978-3-030-25796-5},}
Within the last decade, advances in the automation of vehicles such as cars and planes have promised to fundamentally alter the microeconomics of transporting people and goods. In this paper, we focus on self-flying aircraft enabled by computer vision. This subset of automated flight would be the most valuable in terms of efficiency and of reducing human error and loss of life due to mid-air collisions. We present an analysis of control systems for aircraft collision avoidance.
@unpublished{dubeymehta2017,title={Self-Flying Aircraft},author={Mehta, Raghav and Dubey, Prakhar},year={2017},doi={10.13140/RG.2.2.23011.62245},note={preprint},}