Indian Association for the Cultivation of Science

Name : Debarshi Kumar Sanyal

Department : School of Mathematical & Computational Sciences
Designation : Assistant Professor
Contact : +91 33 2473 4971 (Ext:2334)
Email : debarshi.sanyal@iacs.res.in

 



Home

My current research focuses on natural language processing and modern digital libraries. We apply tools from machine learning, especially deep learning, to address challenges in these areas. Example problems include extraction of keyphrases from documents, extraction of entities and relations from documents, identification of topics and their distribution in text corpora, and automatic question-answering. We are also interested in the theoretical understanding of deep neural networks. In the recent past, I have also worked on mobile computing and wireless networks.


Academic Profile


 

Postdoc, IIT Kharagpur

Advisors: Prof. Partha Pratim Das (IIT Kgp), Prof. Plaban Kumar Bhowmick (IIT Kgp)

PhD (Engineering), Jadavpur University

Advisors: Prof. Samiran Chattopadhyay (JU), Prof. Matangini Chattopadhyay (JU)

BE (Information Technology), Jadavpur University

Project Advisor: Prof. Uttam Kumar Roy (JU)

 

Previous employment with: Infosys (Bangalore), Interra Systems (Kolkata), Xilinx (Hyderabad), KIIT Deemed University (Bhubaneswar), Jadavpur University (Kolkata) [Guest faculty]

   


Teaching

COM-4203/PHD-226 (Machine Learning): 2022 (Spring), 2021 (Spring)

COM-5103/PHD-130 (Advanced Machine Learning): 2021 (Autumn), 2020 (Autumn)

COM-2211 (Data Structures & Algorithms Lab): 2022 (Spring), 2021 (Spring) [shared]

COM-2101B (Data Structures & Algorithms): 2020 (Autumn) [shared]


Open Positions

Candidates interested in pursuing research in machine learning / natural language processing are encouraged to apply. Note that, in order to join the PhD program of IACS, the candidate must satisfy the eligibility criteria mentioned in the advertisement for PhD admission published by the institute from time to time.


Research Activities


 


[My publications are available on Google Scholar.]

 

“Libraries store the energy that fuels the imagination,” said Sidney Sheldon. Web-scale digital libraries, with their large collections of books, manuscripts, photographs, maps, microfilms, newspapers, periodicals, and audio and video tapes, simplify and broaden access to information and knowledge. What will digital libraries of the future look like? How can artificial intelligence (AI) contribute to building better libraries and thereby advance technology-enabled education? How can AI help us navigate our collective knowledge stored in the born-digital and digitized collections spanning centuries of our existence on this planet? These questions are at the root of our scientific investigations, where we intertwine AI and information science to reimagine libraries of the future. What follows is a selection of the problems we are currently working on.

1. Scientific literature mining: Scholarly digital libraries allow researchers to discover and read research papers. However, given the high rate at which papers are published today, especially in science, engineering and medicine, it is difficult to closely follow even a narrow subfield or to discover the right resources with existing search engines. This motivates our interest in new methods of analyzing and indexing scientific papers so that the information tucked away in them is easily discoverable by users. For example, we are investigating ways, inspired by deep learning techniques, to automatically extract keywords from a paper, infer the discourse structure of an article, generate useful summaries of papers, and measure the semantic similarity between a pair of research papers. We have leveraged semantic similarity between papers to build Surrogator, a tool that retrieves open-access surrogates of access-restricted papers. Specifically, if a paper is behind a paywall but a very similar paper from the same authors is available as an open-access document on the web, the latter is presented to a user who cannot afford to access the former.

 

Sanyal, D. K., Bhowmick, P. K., Das, P. P., Chattopadhyay, S., & Santosh, T. Y. S. S. (2019). Enhancing access to scholarly publications with surrogate resources. Scientometrics, 121(2), 1129-1164.
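As a rough illustration of the surrogate idea in item 1, the sketch below picks, among open-access candidates that share at least one author with a paywalled paper, the one whose abstract is most similar to it. This is not Surrogator's actual method, which is described in the paper cited above. TF-IDF cosine similarity stands in for a learned semantic similarity, and the 0.5 threshold, the record fields, the sample data, and all function names are assumptions made for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Build one sparse TF-IDF vector (dict: term -> weight) per document."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    # Number of documents in which each term occurs at least once.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            # Raw term count times a smoothed inverse document frequency.
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0)
            for term, count in counts.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def find_surrogate(target, candidates, threshold=0.5):
    """Return the most similar candidate that shares an author, or None.

    `threshold` is an assumed cut-off, not a value from the cited paper.
    """
    texts = [target["abstract"]] + [c["abstract"] for c in candidates]
    vectors = tfidf_vectors(texts)
    best, best_score = None, threshold
    for candidate, vector in zip(candidates, vectors[1:]):
        # Only consider papers with at least one author in common.
        if not set(target["authors"]) & set(candidate["authors"]):
            continue
        score = cosine(vectors[0], vector)
        if score >= best_score:
            best, best_score = candidate, score
    return best


# Hypothetical records, invented purely for illustration.
paywalled = {
    "title": "Journal version",
    "authors": ["A. Author", "B. Author"],
    "abstract": "Keyphrase extraction from scientific papers using neural sequence labeling",
}
preprints = [
    {"title": "Preprint A", "authors": ["A. Author"],
     "abstract": "Neural sequence labeling for keyphrase extraction from scientific papers"},
    {"title": "Preprint B", "authors": ["A. Author"],
     "abstract": "Routing protocols for wireless sensor networks"},
    {"title": "Preprint C", "authors": ["C. Other"],
     "abstract": "Keyphrase extraction from scientific papers using neural sequence labeling"},
]

surrogate = find_surrogate(paywalled, preprints)
print(surrogate["title"] if surrogate else "no surrogate found")
```

Requiring an author overlap before comparing text keeps a near-duplicate abstract by unrelated authors (Preprint C here) from being offered as a surrogate.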

 

2. Topic models for text corpora: Given a large collection of text documents, can we identify the salient themes running through the documents? These salient themes, or topics, can help readers navigate the collection quickly, albeit at a high level. A reader interested in a specific topic can retrieve the documents focused on that topic and analyze them in depth. Several techniques, ranging from non-negative matrix factorization to probabilistic graphical models to deep neural networks, have been exploited to build topic models for text corpora. However, many of these algorithms do not scale well when the collection is large or there are too many topics. Therefore, their acceptance in real libraries is still limited. We study these algorithms, including their performance bottlenecks. We are also interested in applying topic modeling techniques to build navigation systems for large document collections.
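To make one of the techniques named in item 2 concrete, here is a minimal sketch of topic modeling by non-negative matrix factorization: a document-term count matrix V is approximated as W H, where each row of H is a topic (weights over terms) and each row of W gives a document's topic weights. It uses the standard multiplicative update rules for the squared-error objective in plain Python. The tiny corpus, the choice of two topics, the iteration count, and all function names are assumptions for illustration; this is not the group's implementation, and a real collection would need sparse matrices and a numerical library.

```python
import random
import re
from collections import Counter


def build_matrix(docs):
    """Return a dense document-term count matrix and its sorted vocabulary."""
    tokenized = [re.findall(r"[a-z]+", d.lower()) for d in docs]
    vocab = sorted({t for tokens in tokenized for t in tokens})
    rows = []
    for tokens in tokenized:
        counts = Counter(tokens)
        rows.append([float(counts[t]) for t in vocab])
    return rows, vocab


def matmul(a, b):
    """Multiply two dense matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    """Transpose a dense matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def nmf(v, k, iterations=200, seed=0, eps=1e-9):
    """Factor v (n x m) into non-negative w (n x k) and h (k x m)."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    # Strictly positive random initialization.
    w = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise.
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(k)]
        # W <- W * (V H^T) / (W H H^T), element-wise.
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)]
             for i in range(n)]
    return w, h


# Toy corpus with two themes, invented for illustration.
docs = [
    "neural network training gradient descent neural network",
    "deep neural network gradient training",
    "library catalog books manuscripts library archive",
    "digital library archive books catalog",
]
v, vocab = build_matrix(docs)
w, h = nmf(v, k=2)

# Show the three highest-weighted terms of each topic.
for topic_id, weights in enumerate(h):
    top = sorted(range(len(vocab)), key=lambda j: weights[j], reverse=True)[:3]
    print(topic_id, [vocab[j] for j in top])

# Dominant topic per document, usable as a coarse navigation key.
dominant = [max(range(2), key=lambda t: row[t]) for row in w]
print(dominant)
```

Each update multiplies every entry of V's shape-matched products by dense matrices, so the cost per iteration grows with the number of documents, terms, and topics, which is one way to see why such methods can become slow on large collections with many topics.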