Breakthrough discovery unveils potential treatment for hepatitis B

sorte · 22 February 2025 16:22

ThomasTu · 23 February 2025 07:19

Again, great find. I know one of the coauthors of this study (Rob Schwartz) and he’s a pretty smart cookie. This work is in quite preliminary stages and is only starting to be done in animal models. There is still a way to go to know if it will work in a patient.

Thomas

MMal_jr · 2 March 2025 02:43

Why are researchers not utilizing the current trends and advances in AI, and especially quantum computing? I believe it could speed up research and the search for a functional cure for Hep B. We have technologies that can already do this at minimal to no cost.

Kindly, let’s explore this… the depression that comes with having HBV is taking more people than the disease itself.

bob · 3 March 2025 22:19

Quantum computing is mostly at the scam-the-investors phase right now (including all the press releases from Google et al.). I do think we may get there at some point in the future, but half a dozen breakthroughs are needed first. I’ve looked into this matter very deeply

AI on the other hand is indeed woefully neglected. My personal bet is that an AI first company is likely to leapfrog all the existing developments. Some company that actually uses the AI available, perhaps one that isn’t even in the HBV research space (!), is likely to fully solve the problem and also get the funding / use the hype train to get it approved

This is my prediction. We can check back on this post in a couple of years

I think it would be a great time for an HBV researcher to seek funding using this strategy right now. They wouldn’t even have to be technically minded with computers - there are tools out there (and more being released almost every week) which should be able to help that are user friendly

For example I just generated this deep research report. These are tools that are available for researchers to use right now. I wonder if I’m missing the mark, or has any HBV researcher seen these tools so far?

AI-Driven Platforms for Hepatitis B Cure Research

Developing a cure for chronic Hepatitis B can leverage modern computational tools that integrate AI and bioinformatics. Below we highlight user-friendly platforms (both open-source and commercial) across key areas of drug discovery, genomics, clinical data analysis, and systems biology, emphasizing ease of use and AI/ML integration.

1. Drug Discovery and Molecular Docking Tools

AI-powered drug discovery platforms help identify potential antivirals, repurpose existing drugs, and model how compounds bind to Hepatitis B virus (HBV) targets. Many offer intuitive interfaces for molecular docking and simulation:

PyRx (Virtual Screening Tool) – An open-source GUI for computer-aided drug design that includes an easy-to-use docking wizard. PyRx incorporates AutoDock Vina and AutoDock 4 for protein–ligand docking, letting users prepare compounds, run virtual screens, and analyze binding poses without coding (PyRx - Virtual Screening Tool download | SourceForge.net). It simplifies each step of virtual screening through a point-and-click interface, making it ideal for researchers new to molecular modeling.
Tamarind Bio (No-Code AI Drug Discovery) – A cloud-based platform providing no-code access to AI models for drug discovery. It offers web interfaces for tasks like protein structure prediction (AlphaFold), molecular docking (including DiffDock, a deep learning docking model), and even protein design (Harnessing the Power of AI in Bioinformatics: The Tamarind Bio Approach). Researchers can paste sequences or upload structures and let AI models run in the backend, removing the need for local high-performance computing. For example, the interface for AlphaFold on Tamarind Bio allows users to input an amino acid sequence and receive a predicted 3D structure via email (see image below). This approach lowers the barrier for life scientists to apply cutting-edge AI tools in HBV research.
(Launch YC: Tamarind Bio: No-code bioinformatics for scientists | Y Combinator) Tamarind Bio’s web interface for AlphaFold structure prediction – an example of a user-friendly, cloud-based AI tool. Researchers can submit an HBV protein sequence and get a predicted structure without any coding, aiding structure-based drug design (Harnessing the Power of AI in Bioinformatics: The Tamarind Bio Approach).
SwissDock (Web-Based Docking) – A free online docking server from the Swiss Institute of Bioinformatics that requires only a protein structure and ligand input. SwissDock’s purpose is to provide a “user-friendly interface to predict molecular interactions” between a target protein and a small molecule (SwissDock). It uses docking engines (Attracting Cavities and AutoDock Vina) on the backend. SwissDock is accessible via a browser, making it convenient for screening compounds against HBV proteins or host receptors with no software installation (SwissDock).
DiffDock (AI-Based Docking) – A deep learning approach to ligand–protein docking. While the model itself is code-based, it’s available through web platforms like Tamarind Bio for ease of use (What is the best and most user friendly online tool for Molecular docking? | ResearchGate). DiffDock treats docking as a generative modeling problem to predict likely binding poses without exhaustively sampling orientations. Through a simple web form, researchers can perform “blind docking” (no predefined binding site) to predict where a small molecule might bind an HBV protein (What is the best and most user friendly online tool for Molecular docking? | ResearchGate). This AI-driven method can complement traditional docking by exploring novel binding modes.
Commercial Suites (Schrödinger, MOE) – Industry-grade drug discovery software packages offer user-friendly graphical environments and often integrate AI modules. For example, Schrödinger’s Maestro interface includes the Glide docking tool (for protein–ligand docking) and machine-learning-based scoring, all in a polished GUI (What is the best and most user friendly online tool for Molecular docking? | ResearchGate). Such platforms (along with tools like CCDC’s GOLD or BIOVIA Discovery Studio) provide powerful molecular modeling capabilities – from homology modeling of HBV proteins, to pharmacophore modeling and virtual screening – with extensive documentation and support. These are pricier solutions but are known for their robust features and user support.

Notably, AI is accelerating drug discovery by enabling the analysis of vast chemical spaces and biological data. Platforms like MolProphet exemplify this trend – it’s an AI-based drug discovery platform aiming to make advanced computational tools accessible to every researcher via the cloud ((PDF) MolProphet: A One-Stop, General Purpose, and AI-Based Platform for the Early Stages of Drug Discovery). Such tools can rapidly screen huge compound libraries for HBV inhibitors using pretrained deep neural networks, predict binding affinities, and even generatively design new molecules. The takeaway is that whether one chooses open-source tools like AutoDock (via PyRx) or proprietary AI-driven platforms, there are many user-friendly options to kickstart in silico drug discovery for Hepatitis B.
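As a rough illustration of what "analyzing" a virtual screen involves once the docking runs are done, here is a minimal sketch that ranks ligands by their best AutoDock Vina score. It assumes each output PDBQT contains the usual `REMARK VINA RESULT:` lines, one per pose, with the affinity in kcal/mol as the first number. The ligand names and scores are made-up placeholders, not results for any HBV target.

```python
import re

# Matches the per-pose score lines AutoDock Vina writes into its output PDBQT,
# e.g. "REMARK VINA RESULT:      -7.9      0.000      0.000"
VINA_RESULT = re.compile(r"^REMARK VINA RESULT:\s+(-?\d+(?:\.\d+)?)", re.MULTILINE)


def best_affinity(pdbqt_text):
    """Return the most favourable (lowest) affinity in kcal/mol, or None."""
    scores = [float(s) for s in VINA_RESULT.findall(pdbqt_text)]
    return min(scores) if scores else None


def rank_ligands(outputs):
    """Sort {ligand_name: pdbqt_text} by best affinity, strongest binder first."""
    scored = [(name, best_affinity(text)) for name, text in outputs.items()]
    scored = [(name, score) for name, score in scored if score is not None]
    return sorted(scored, key=lambda pair: pair[1])


# Hypothetical docking outputs; names and numbers are invented for illustration.
example_outputs = {
    "ligand_A": "MODEL 1\nREMARK VINA RESULT:      -6.2      0.000      0.000\nENDMDL\n"
                "MODEL 2\nREMARK VINA RESULT:      -5.9      1.812      2.411\nENDMDL\n",
    "ligand_B": "MODEL 1\nREMARK VINA RESULT:      -8.1      0.000      0.000\nENDMDL\n",
    "ligand_C": "MODEL 1\nENDMDL\n",  # no score lines, so this ligand is skipped
}

ranking = rank_ligands(example_outputs)
for name, score in ranking:
    print(f"{name}\t{score:.1f} kcal/mol")
```

A docking score is only a crude estimate of binding, so a ranking like this is a shortlist for follow-up experiments rather than evidence of activity.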

2. Viral Genome Analysis and Sequencing Tools

Understanding the HBV genome – its mutations, genotypes, and potential gene editing strategies – is crucial for a cure. Several bioinformatics platforms make viral genome analysis accessible, even to those without programming skills:

Geneious Prime – A commercial software suite well-regarded for its intuitive interface in sequence analysis. Geneious integrates tools for read mapping, genome assembly, alignment, phylogenetics, and cloning. It has been used widely for viral genome assembly and mutation tracking. The philosophy of Geneious is to “make bioinformatics accessible by providing an intuitive, user-friendly interface that transforms raw sequence data into meaningful visualizations.” (Geneious | Bioinformatics Software for Sequence Data Analysis). Researchers can assemble an HBV genome from NGS reads, call variants, and visualize mutations along the genome with just a few clicks. The GUI also supports designing primers or gRNAs for CRISPR, with real-time feedback on target sites – helpful for exploring gene editing approaches on HBV DNA or cccDNA.
Galaxy Platform (usegalaxy.org) – An open-source, web-based platform for data-intensive biomedical research that is designed to be accessible to scientists without programming experience (About Galaxy - Galaxy Community Hub). Galaxy provides a wide range of tools for genomic analysis through a simple web UI. For example, one can use Galaxy to perform HBV read alignment, variant calling, and genome assembly by selecting tools from menus and connecting them in a workflow. The platform emphasizes reproducibility and transparency – analyses can be shared or repeated easily (About Galaxy - Galaxy Community Hub). Galaxy also has tools for CRISPR guide RNA design and off-target analysis. Researchers can simulate CRISPR edits by aligning post-edit reads or using tools like CRISPResso in Galaxy to evaluate gene editing outcomes. Because it’s cloud-accessible and free, Galaxy is ideal for those starting from scratch with limited infrastructure.
Nextstrain and Phylogenetics Tools – For those interested in HBV evolution and epidemiology, Nextstrain is a platform that visualizes viral mutation data and phylogenetic trees in an interactive web interface. While originally built for pathogens like influenza and SARS-CoV-2, it can be applied to HBV sequences to track genotype distributions and mutation patterns over time. Similarly, tools like MEGA (for phylogenetic analysis) and PhyloSuite have user-friendly GUIs to construct evolutionary trees of HBV strains, which can identify mutations associated with drug resistance or immune escape. These tools help researchers new to bioinformatics derive insights from viral sequence data without needing to manually write code.

In summary, whether assembling viral genomes, tracking mutations, or designing CRISPR edits, there are accessible bioinformatics tools available. They offer drag-and-drop interfaces or guided workflows so researchers can focus on interpreting HBV data rather than technical details. The above platforms (among others like CLC Genomics Workbench, IGV for visualization, etc.) ensure that even those “starting from scratch” can analyze the HBV genome and explore therapeutic genome editing strategies.
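To give a feel for what a guide RNA design tool does at its first step, here is a minimal sketch that scans both strands of a sequence for 20-nucleotide protospacers followed by an NGG PAM, the motif recognised by the commonly used SpCas9 enzyme. The function name `find_guides` and the toy sequence are hypothetical; the sequence is invented and is not HBV. Real tools add the parts that matter most in practice, such as off-target and efficiency scoring, which this sketch leaves out.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_guides(seq, guide_len=20):
    """List (strand, start, protospacer, pam) for every NGG PAM site.

    `start` is the 0-based position of the protospacer on the strand searched.
    """
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(rf"(?=([ACGT]{{{guide_len}}})([ACGT]GG))")
    forward = seq.upper()
    hits = []
    for strand, s in (("+", forward), ("-", reverse_complement(forward))):
        for m in pattern.finditer(s):
            hits.append((strand, m.start(), m.group(1), m.group(2)))
    return hits


# Invented toy sequence, for illustration only.
toy_sequence = "ATCATCATCTACTACTACTATGGCAT"
hits = find_guides(toy_sequence)
for strand, start, protospacer, pam in hits:
    print(strand, start, protospacer, pam)
```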

3. AI in Clinical Trials and Biomarker Discovery

Identifying biomarkers and analyzing patient data can guide treatment strategies for chronic Hepatitis B. AI-driven platforms are emerging to sift through clinical and multi-omics data to find patterns, suggest targets, and optimize clinical trial designs:

Genialis ResponderID – An example of a machine-learning-based biomarker discovery platform. It uses a “biology-first framework to model underlying disease biology” and has been applied in oncology to build predictive gene signatures (ResponderID™: The Biology-First Machine Learning Platform for Biomarker Discovery). In the context of Hepatitis B, a platform like this could integrate patient gene expression profiles, immune markers, and clinical outcomes (response vs non-response to therapy) to discover biomarkers that predict treatment response or disease progression. Genialis’ platform is cloud-based and comes with a user-friendly interface for scientists to input data and interpret results. It was recognized for its innovative approach to modeling complex gene signatures and translating them into clinically useful predictors (ResponderID™: The Biology-First Machine Learning Platform for Biomarker Discovery). Such AI tools can accelerate finding which patients might benefit from novel therapeutic approaches (like immune therapies or combination treatments in HBV).
IBM Watson Health (Clinical Trial Insights) – IBM’s AI has been applied to healthcare data for pattern recognition and hypothesis generation. One aspect is analyzing clinical trial databases and real-world patient data to match patients to trials or identify signals. While Watson for Drug Discovery was a product that ingested literature and omics data to propose targets, similar AI engines can be tuned for HBV. For instance, an AI model could analyze thousands of HBV patient records (labs, viral load, liver enzyme levels, biopsy results) and find latent subgroups or key risk factors that human analysis might miss. Although not a specific “product” to cite, the concept is that AI can mine patient datasets for correlations (e.g., a certain cytokine profile that predicts response to interferon therapy), effectively surfacing potential biomarkers or even drug repurposing opportunities.
H2O.ai AutoML & KNIME – For researchers with their own clinical datasets, there are user-friendly machine learning platforms to analyze them. KNIME Analytics Platform, for example, is an open-source tool with a visual workflow interface (no coding required) for data analysis and ML. It allows users to drag-and-drop “nodes” to read data (e.g., a spreadsheet of patient parameters), clean and transform it, then apply ML algorithms like classification or clustering. KNIME is praised as “highly accessible, no-code, drag-and-drop visual programming” for building even complex machine learning models (What is KNIME? An Introductory Guide | DataCamp). A researcher could use KNIME to build a model predicting which chronic HBV patients will develop cirrhosis, by training on lab results and genetic data. Similarly, AutoML tools (from H2O.ai or Google Cloud AutoML) provide a web interface where you upload a dataset and the tool automatically tries many algorithms to find the best predictive model – useful for those unfamiliar with ML tuning. These platforms often include explanations of the model, assisting in biomarker identification (e.g., highlighting that “age” and “HBsAg level” were the top predictors in the model).
Clinical Trial Simulation Platforms – Some specialized software (e.g., Certara’s Simcyp or Trial Simulator) incorporate AI and modeling to simulate clinical trial outcomes. They allow users to model virtual populations and drug interactions. While primarily pharmacokinetic/pharmacodynamic modeling tools, when combined with AI optimization, they can help design efficient trials (like optimizing dosing regimens for a new HBV drug). Newer AI platforms also scrape clinical trial registries to suggest repurposing opportunities – for example, identifying that a drug in trials for another disease has a mechanism that could target HBV.

Overall, AI-driven clinical data platforms range from broad-purpose analytics software to bespoke biotech company offerings. For a researcher beginning in this area, KNIME or Orange (another visual data mining tool) would be approachable starting points to apply machine learning on their Hepatitis B patient data. On the other end, collaborating with AI biotech firms or using specialized biomarker discovery services can yield deeper insights – these often have friendly user portals or reports so that even without AI expertise, one can interpret the findings.
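The idea of a tool "highlighting the top predictors" can be illustrated with something much simpler than the AutoML platforms above: a univariate screen that ranks each measurement by how strongly it separates two patient groups, using Cohen's d (a standardized mean difference). This is a deliberately basic stand-in, not what KNIME or H2O.ai actually run, and every number in the table is invented for illustration. The column names (`age`, `hbsag_log10`, `responder`) are hypothetical.

```python
from statistics import mean, stdev


def effect_size(group_a, group_b):
    """Cohen's d between two groups, using a pooled standard deviation."""
    na, nb = len(group_a), len(group_b)
    pooled_var = (
        (na - 1) * stdev(group_a) ** 2 + (nb - 1) * stdev(group_b) ** 2
    ) / (na + nb - 2)
    return (mean(group_a) - mean(group_b)) / pooled_var ** 0.5


def rank_features(rows, outcome_key):
    """Rank numeric features by absolute effect size between outcome groups."""
    features = [k for k in rows[0] if k != outcome_key]
    pos = [r for r in rows if r[outcome_key]]
    neg = [r for r in rows if not r[outcome_key]]
    scores = {
        f: effect_size([r[f] for r in pos], [r[f] for r in neg]) for f in features
    }
    return sorted(scores.items(), key=lambda kv: abs(kv[1]), reverse=True)


# Entirely synthetic patient table; not real clinical data.
patients = [
    {"age": 40, "hbsag_log10": 2.0, "responder": True},
    {"age": 50, "hbsag_log10": 2.5, "responder": True},
    {"age": 60, "hbsag_log10": 3.0, "responder": True},
    {"age": 42, "hbsag_log10": 4.0, "responder": False},
    {"age": 52, "hbsag_log10": 4.5, "responder": False},
    {"age": 62, "hbsag_log10": 5.0, "responder": False},
]

feature_ranking = rank_features(patients, "responder")
for name, d in feature_ranking:
    print(f"{name}\t{d:+.2f}")
```

With six invented patients this shows only the mechanics. On real data, a ranking like this needs proper validation on held-out patients before any marker is taken seriously.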

4. Systems Biology and Pathway Analysis Platforms

Curing HBV may require understanding the virus’s interplay with host cellular pathways. Systems biology tools help map these interactions and identify key nodes (genes or proteins) that could be targeted by drugs. The following platforms assist in network-based drug discovery and pathway analysis in a user-friendly manner:

Cytoscape – A widely-used open-source software for visualizing and analyzing molecular interaction networks and pathways. Cytoscape provides a powerful GUI to import interaction data (like protein-protein interactions between HBV proteins and human proteins) and then visualize these as networks. According to its description, “Cytoscape is an open source software platform for visualizing complex networks and integrating these with any type of attribute data.” (Cytoscape: An Open Source Platform for Complex Network Analysis …). For example, one could take a list of human genes known to interact with HBV (from a database or literature) and use Cytoscape to create a network graph. Through its visual tools, one might spot that HBV X protein interacts with multiple host signaling proteins – indicating a hub that could be druggable. Cytoscape’s interface allows users to style networks (color nodes by importance, etc.), run network analysis algorithms (to find clusters or key connectors), and incorporate expression data (like highlighting which host genes are upregulated in infection). With many plug-ins available (for enrichment analysis, clustering, etc.), it brings sophisticated network analysis to those without programming – all through menus and dialogs.
(Screenshots) An example network visualization in Cytoscape, demonstrating how complex interaction networks can be explored via an intuitive interface (Cytoscape: An Open Source Platform for Complex Network Analysis …). Nodes and edges (here representing biological entities and their interactions) can be analyzed to find key pathway connections, aiding in identifying potential drug targets.
Ingenuity Pathway Analysis (IPA) – A commercial pathway analysis tool by QIAGEN, known for its extensive curated knowledge base. IPA provides a web-based interface where users can input a gene list (say, genes differentially expressed in HBV-infected vs healthy liver tissue) and get back enriched pathways, interaction networks, and predicted upstream regulators. It’s touted as an all-in-one application that “enables analysis, integration, and understanding of data … and building of interactive models of experimental systems,” helping find the significance of molecules or candidate biomarkers in the context of larger biological networks (QIAGEN Ingenuity Pathway Analysis (IPA)). For instance, using IPA a researcher might discover that HBV infection data points to activation of the TGF-β signaling pathway, and IPA would show a network of how HBV proteins, host cytokines, and transcription factors interact in that pathway. The interface is user-friendly: mostly point-and-click to run analyses and visualize network diagrams, with the heavy lifting done by IPA’s backend and curated database.
NetworkAnalyst (Web) – An open web platform for network-based analysis that doesn’t require installation. NetworkAnalyst allows users to upload gene/protein lists and performs instant protein–protein interaction network construction, module detection, and visualization in the browser (NetworkAnalyst - integrative approaches for protein–protein interaction network analysis and visual exploration - PMC). It was designed to be “a web-based user-friendly tool that integrates all essential steps of network analysis” (integrative approaches for protein–protein interaction network …). For an HBV researcher, this means you could take a set of host proteins affected by HBV and, through a simple web form, generate an interaction network, identify subnetworks (e.g., a cluster of immune-related proteins), and do pathway enrichment – all via interactive web charts and without coding. The platform provides tutorials and tips, making network analysis approachable even for bench scientists (NetworkAnalyst - integrative approaches for protein–protein interaction network analysis and visual exploration - PMC). It effectively wraps the complexity of network algorithms into an easy GUI.
Pathway Databases & Tools (KEGG, Reactome) – There are also user-friendly web interfaces for exploring known biological pathways that can be relevant for HBV. KEGG has an online pathway mapper where one can highlight a list of genes on pathway diagrams (for example, highlighting which immune pathways HBV modulates). Reactome Pathway Browser is another tool: it provides an interactive path diagram for human pathways and allows overlaying expression data. A researcher could load a list of up/down-regulated host genes during HBV infection to see which pathways light up. These tools, while not AI-driven per se, are invaluable for interpreting systems biology data and have intuitive web interfaces or stand-alone apps.
Host–Virus Interaction Databases – Specialized resources like VirHostNet or Viruses.STRING compile virus-host protein interactions and often provide visualization interfaces (VirHostNet 2.0: surfing on the web of virus/host molecular …). For example, VirHostNet 2.0 offers a web interface to query HBV and see its interaction network with host proteins, including a network viewer (similar to Cytoscape) to explore those connections (VirHostNet 2.0: surfing on the web of virus/host molecular …). Such platforms are useful starting points – they present a ready network one can explore without any data processing. Layering AI on top, one could imagine using network analysis algorithms (some are integrated in these platforms) to predict new interactions or essential nodes (using measures like network centrality) that could be therapeutic targets.

Through these systems biology tools, even newcomers can perform complex analyses: building interaction maps of HBV with host cell pathways, identifying critical “hub” proteins in those networks, and exploring how drug candidates might affect the network. Notably, some tools now incorporate AI/machine learning for network predictions – for example, using machine learning to predict new virus-host interactions or to prioritize drug targets based on network topology (Network embedding unveils the hidden interactions in the …). But even at a basic level, the ability to visually navigate pathways and networks in a point-and-click manner is incredibly helpful. It allows researchers to form hypotheses (e.g., “HBV proteins heavily interact with the p53 tumor suppressor network, so maybe boosting p53 could be a strategy”) that can then be tested experimentally or through further computational modeling.
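The notion of a "hub" can be made concrete with degree centrality, one of the simplest network centrality measures: the fraction of other nodes each node is directly connected to. The sketch below computes it in plain Python on a tiny invented edge list. The node names (`viral_X`, `host_1`, and so on) are placeholders and do not represent real HBV–host interactions; tools like Cytoscape compute this and richer measures through their menus.

```python
from collections import defaultdict


def degree_centrality(edges):
    """Return {node: degree / (n - 1)} for an undirected edge list."""
    neighbours = defaultdict(set)
    for a, b in edges:
        if a != b:  # ignore self-loops
            neighbours[a].add(b)
            neighbours[b].add(a)
    n = len(neighbours)
    if n < 2:
        return {}
    return {node: len(nbrs) / (n - 1) for node, nbrs in neighbours.items()}


# Invented placeholder interactions, for illustration only.
toy_edges = [
    ("viral_X", "host_1"),
    ("viral_X", "host_2"),
    ("viral_X", "host_3"),
    ("host_1", "host_2"),
    ("host_3", "host_4"),
]

centrality = degree_centrality(toy_edges)
top_hub = max(centrality, key=centrality.get)
for node, score in sorted(centrality.items(), key=lambda kv: -kv[1]):
    print(f"{node}\t{score:.2f}")
```

A high degree only says a node is well connected in the data that was loaded. Whether it is a sensible drug target is a separate biological question.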

5. Cloud-Based and Open-Source Platforms

For researchers starting from scratch (with limited local resources), cloud platforms and open-source tools provide accessible computing environments and collaborative features. These platforms lower barriers by offering ready-to-use software and scalable hardware via the internet:

Galaxy Cloud Instances – As mentioned, Galaxy is not only open-source but also available as public servers (e.g., usegalaxy.org). It’s “browser-accessible” and even allows launching one’s own instance on the cloud (About Galaxy - Galaxy Community Hub). New users can utilize Galaxy without any installation, performing analyses on cloud hardware transparently. This is ideal for heavy tasks like whole-genome sequencing analysis or machine learning on large datasets, as the user doesn’t need to manage the compute backend. Galaxy’s philosophy of accessibility and reproducibility (About Galaxy - Galaxy Community Hub) makes it a top choice for beginners in computational biology, effectively turning a web browser into a full bioinformatics lab.
KBase (DOE Systems Biology Knowledgebase) – KBase is a free, open web-based platform particularly oriented towards systems biology and genomics. It allows users to perform bioinformatic workflows through a GUI, and importantly to share and reproduce those analyses. KBase is noted as “a platform where anyone can conduct sophisticated and reproducible bioinformatic analyses via a graphical user interface.” (Frontiers | Bioinformatic Teaching Resources – For Educators, by Educators – Using KBase, a Free, User-Friendly, Open Source Platform). It includes apps for data uploading, assembly, annotation, metabolic modeling, etc. For someone working on HBV, KBase could be used to, say, compare HBV genomes, analyze transcriptomic responses, or even simulate metabolic interactions in the liver. All of this can be done in the cloud on KBase’s servers, with the user dragging and connecting analysis modules in a narrative interface. Since it’s collaborative, it also enables sharing results with colleagues or the community.
Terra (Broad Institute’s Terra.bio) – Terra is a cloud-native platform (developed by Broad/Google/Microsoft) for biomedical data analysis that has gained popularity for large-scale genomics. It is described as “a secure, scalable, open-source platform for biomedical researchers to access data, run analysis tools and collaborate.” (Accelerating biomedical research with Terra and Google Cloud). Terra provides a user-friendly web portal where one can create workspaces, upload data (or access public datasets), and run workflows or interactive notebooks (like Jupyter) without worrying about the cloud infrastructure details. For example, one could run the GATK workflow for variant calling on HBV patient samples through Terra with a few clicks. Terra’s built-in workflow library, and the ability to use notebook environments with pre-installed libraries, makes it flexible for both coding and no-coding users. It basically packages the power of Google Cloud in a researcher-friendly interface. (Terra is free to use; you only pay for cloud compute/storage consumed, and it often has credits or free tiers for testing.)
Google Colab and Jupyter Notebooks – For AI and machine learning tasks, Google Colaboratory provides free GPU-enabled notebooks accessible via the browser. While it does require some Python coding, it removes the setup barrier – no need to install Python or libraries locally. There are numerous shared notebooks for tasks like training a machine learning model on biomedical data or using libraries like DeepChem (for drug discovery) and Biopython (for sequence analysis). A researcher starting on AI for HBV can find a Colab notebook for, say, predicting drug-target binding with a neural network, and simply run it step-by-step in the browser. This way they harness cloud GPUs and pre-configured environments at no cost. JupyterHub instances (multi-user Jupyter notebook servers) are also available via cloud or university resources, providing a similar user-friendly coding environment that is immediately ready to use.
Open-Source Libraries and Reproducible Environments – It’s worth noting the plethora of domain-specific libraries (like RDKit for chemoinformatics, Bioconductor in R for genomics, etc.) which are free. Platforms like Colab or Kaggle Notebooks allow using these libraries with minimal setup. Additionally, containerization (Docker/Singularity) is often leveraged in cloud platforms to encapsulate bioinformatics tools. For the end-user, this means one-click access to complex pipelines. For example, a novice could pull a Docker image of an HBV analysis pipeline and run it on their laptop or cloud instance without dealing with dependencies. Many open-source tool developers provide such images or workflows to ease usage. Tools like Nextflow and Snakemake have accompanying graphical front-ends or use in Galaxy/Terra, combining the power of open-source pipelines with user-friendly execution.

In essence, modern cloud and open-source ecosystems ensure that even a small lab or an individual researcher can perform cutting-edge computational biology for Hepatitis B research. With minimal investment, one can assemble genomes, screen drug compounds, run machine learning models, and analyze networks – all through web interfaces or simple interactive scripts. The focus on user experience (many platforms explicitly advertise being accessible to non-programmers) means the learning curve is greatly reduced. This empowers researchers to start computational explorations “from scratch” and quickly contribute to the multi-disciplinary effort of curing chronic Hepatitis B.

ThomasTu · 5 March 2025 21:52

Thanks @Bob again for bringing such detailed information.

I am not convinced that the use of AI is “woefully neglected” in HBV research. As a rule, molecular and clinical research is slow relative to these advances in AI - I would estimate from idea conception to actually publishing a piece of work is 3-10 years. One of the most important things to good labs is that the data needs to be as robust as possible - this means time, funding, and meticulous efforts.

On the other hand, the tolerance for things being inaccurate in AI appears to be much higher (as we see from the very confidently incorrect outputs of ChatGPT and other generative AIs). There’s also a huge amount of cash being splashed about for these technologies. It isn’t fair to do a direct comparison of the two.

Hope this provides some context from the researcher community. I’d love for other @ScienceExperts to comment too.

TT

availlant · 5 March 2025 22:29

Dear @bob ,

To add to Thomas’ comments, it is important to remember that AI can only be as good as its training dataset. For instance, the rules of chess are easily mastered by AI because 100% of the applicable training dataset is known.

Platforms like ChatGPT and Grok can only understand and apply what is already known. They are currently competent at writing and assembling data from the internet, but they still fail to answer complex questions we already know the answers to.

For biological systems, even though there is a lot of knowledge to add to a training dataset, it is certainly not complete and is likely only a small fraction of what can be known (or needs to be known).

For instance, you can try to ask AI to solve crystal structures of proteins which are not yet known because they are difficult to crystallize, but there is no way of knowing if these proposed structures are correct. We certainly do not understand everything that influences the crystal structure of a protein.

The same goes for HBV. We cannot ask AI to tell us what the answer is for curing HBV because it would have to know all of the components of the puzzle and we are not close to this.

@availlant

bob · 5 March 2025 22:30

Thanks Thomas. I realized I might have been too strongly opinionated after posting that. I’m just concerned that things are changing fast and most of the people I know are not taking advantage of it

I have a different perspective because I’m a programmer and I’ve seen firsthand the level of code this can output if used well. It handles things that took me decades to learn, and it can one-shot something that would literally take me months of work. Most programmers seem to be in denial about it because it really is surreal. In some ways it kind of makes you feel stupid for spending all those years studying it. And it’s coming for other industries too. Though overall I love it and it’s exciting

I would encourage everyone who can to sign up for Google Co-Scientist here: Google AI co-scientist Trusted Tester Program (access is very limited, so if all the researchers try, maybe one will get in)

It seems like it should be very good: Accelerating scientific breakthroughs with an AI co-scientist

bob · 5 March 2025 22:36

They are getting better at working from limited data. Just remember that things like AlphaFold do get the protein structures correct the majority of the time, but most importantly, they are the worst they will ever be right now and they are improving exponentially

The latest reasoning models in LLMs show them filling in the gaps in missing knowledge very effectively

With that said labs will always be necessary for providing data, of course

john.tavis · 6 March 2025 15:42

Hi @bob

AI in HBV research is not neglected by any means. Labs around the world are rapidly adopting it. For example, my lab has made substantial advances with AlphaFold 2 and 3, and we heavily use the Schrödinger molecular analysis suite, including its in silico screening capabilities. My lab is far from the only one adopting it.

However, AI is not a cure-all. It needs a very clean and large database to work well. In those cases, it can readily identify patterns that humans cannot detect unaided when the question asked of the database is carefully defined.

To use AlphaFold as an example: it uses essentially the full protein coding sequence database as its reference (a HUGE amount of data), and it asks a very tightly defined question. Even then its results are wrong about 15-20% of the time, and the structures provided are often not of high enough resolution for some uses that would be desired. Our use of AlphaFold 2 and 3 provided the first useful structure of the HBV polymerase, and it has been super-helpful in our enzyme engineering studies. However, the structure is either not of high enough resolution or slightly wrong, because in silico screening of millions of compounds against 3 novel targets on the enzyme has not yielded any active compounds.

In cases where the database is small, corrupted, or is not fully applicable to the question asked, AI will often simply hallucinate an answer. ChatGPT is a good example–I’ve asked it questions that I know the answer to because I did the experiments that provided the answer, and it has returned gibberish. Papers are appearing in the literature that were entirely produced by ChatGPT that are wholly fictional.

Also, these very large language models that incorporate essentially all of the data on the internet are highly subject to data contamination because they cannot tell true data from mistaken or fraudulent information, and because their own mistakes become part of the database for future searches. Their accuracy will steadily decline unless the programmers figure a way out of the rapidly expanding data contamination problem.

Therefore, AI needs to be used in a judicious, well informed manner. In those cases it can be a wonderful tool. It is just not the answer to everything, and it is not being neglected by the research community.

John.

bob · 6 March 2025 21:05

Thanks for the clarification John. I’m actually delighted to hear I was completely wrong about that and that you and others are on the ball with all this new and quickly changing stuff

All I want is for the world to see a cure as soon as it possibly can

The data contamination issue is being worked on. Data curation and refinement has always been 90% of the job in building any machine learning model from the beginning so this is just another iteration of that

availlant · 7 March 2025 09:44

You made very important comments here John.

bob · 8 March 2025 23:39

Apologies everyone for making a completely incorrect assumption

In all other aspects of my life and in other industries I see people missing the use of it, many people disregarding it for various strange reasons, etc., when they could be benefitting from it

So it’s great to find out the scientists here are the exception and that you are all fully on the ball. That really is good news

The reason why I’m so insistent about AI is because I think this is where we’re at

ThomasTu · 13 March 2025 06:28

Hi @bob,

Please don’t be sorry at all, you have raised fantastic points and discussed it in good faith. I love that you are passionate enough to bring this useful information to everyone’s attention and stimulate conversation.

Thomas

ImHopefull · 14 March 2025 22:53

Hi, Bob! I have been reading a lot about AI, and it’s really impressive what this technology can do. The majority of experts in this field say the same thing as you did. Some even claim that we are about to reach AGI very soon. Some predict it could happen by 2028, while others mention 2025 or 2026. I don’t know if this is partly due to hype, but something truly remarkable seems to be emerging from AI.

ThomasTu · 28 April 2025 00:48

Dear all,

Just an update that this study has been discussed on This Week in Virology (discussion starts around 48:38):

TT

bob · 28 April 2025 21:23

Thanks for that. I listened to it all, it was good to see your name mentioned. It’s great knowing that you’re out there working on stuff. There couldn’t be a better face for the mission

Also this drug does seem extremely interesting. I’ve never heard of anything being so effective at such low concentrations in this space. Besides its effects on cccDNA, it will be good to see how much it suppresses the HBsAg produced by integrated HBV. @availlant is usually brilliant at debunking things that look good on paper, I wonder if you have any comments here?

availlant · 29 April 2025 01:00

Dear @bob,

As always, the evolution of our understanding of the molecular biology of cccDNA is fascinating, but as we know there is a very long distance between work in primary human hepatocytes and clinical impact.

I believe we are talking about CBL137, a molecule described here which has inhibitory effects on full length HBx transcription and HBV infection.

There are two problems with targeting chromatin function to inhibit transcription from cccDNA:

1. Generally heterochromatin (condensed DNA) is inactive and insoluble within the nucleus. Inactive cccDNA is the same. While a molecule like CBL137 may be active against euchromatin (uncondensed, transcriptionally active DNA) like that found in active cccDNA, it is unlikely to affect inactive cccDNA. In general, NUCs and pegIFN already do a very good job of inactivating cccDNA but this is not enough when HBsAg (from subviral particles) is still present, preventing immune control of active cccDNA and allowing latent cccDNA to reactivate.
2. Targeting chromatin function on cccDNA (chromatin which is derived entirely from the host) will also affect chromatin function in host chromosomes / genes. CBL137 (aka curaxin-137) is a well known inhibitor of histone function which has been in development as an anticancer agent by Incuron for several years. This compound also affects the activity of other important cellular proteins like p53 and NF-κB. This will almost certainly have an unacceptable toxicity profile in patients with chronic HBV infection.

@availlant

Nawab · 29 April 2025 13:43

Dear @availlant, then what are the benefits of the above approach for CHB?

availlant · 29 April 2025 15:11

Dear @Nawab,

Unfortunately, I don’t see any benefit of this approach when there are well tolerated agents (ETV, TDF and TAF) already approved which do the same thing.

There are unfortunately some important deficiencies in the Cell paper which were not addressed during its peer review:

1. The cytotoxicity of CBL-137 (Figure S4 in this paper) was measured in an unrelated cancer cell line derived from kidney (HEK293T cells) instead of primary human hepatocytes (where the efficacy measurements were conducted). This is not an acceptable surrogate measure of cellular toxicity of this compound since this kidney cell line is already highly transformed (cancerous) and will be less sensitive to alterations in cellular metabolism than normal differentiated human hepatocytes. Even with this problem, toxicity of CBL-137 is apparent in HEK293T cells at concentrations which give effects in primary human hepatocytes.
2. No effort has been made by the authors to explore the role of alteration of p53 or NF-κB function in the effects on HBV transcript production. These are known activities of CBL-137 which are active at the concentrations at which reduction in HBV transcripts occurs.

@availlant

Nawab · 30 April 2025 05:55

Dear @availlant, when will Replicor’s Compassionate Program end? And will there be recruitment for phase III after that?