Difference between revisions of "AI Models"

From UFRC
Line 5:
  {{#get_web_data:url=https://data.rc.ufl.edu/pub/ufrc/data/ai_model_data.csv
  |format=CSV with header
- |data=dirpath=dirpath,dirsize=dirsize,name/url=name/url,version=version,license_txt=license_txt,date=date,categories=categories,description=description
+ |data=dirpath=dirpath,dirsize=dirsize,name_url=name_url,version=version,license=license,date=date,categories=categories,description=description
  |cache seconds=4
  }}
Line 20:
  {{#for_external_table:<nowiki/>
  {{!}}-
- {{!}} {{{name/url}}}
+ {{!}} {{{name_url}}}
  {{!}} {{{categories}}}
  {{!}} <code>{{{dirpath}}}</code>
  {{!}} {{{dirsize}}}
  {{!}} {{{version}}}
- {{!}} {{{license_txt}}}
+ {{!}} {{{license}}}
  {{!}} {{{date}}}
  {{!}} {{{description}}}
  }}
  |}
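
The revision above renames two of the CSV columns (name/url to name_url, license_txt to license). As a rough illustration of what that means for anything that reads the file outside the wiki, here is a minimal Python sketch that parses CSV text with a header row and maps the old column names onto the new ones. The function name read_models, the RENAMED mapping, and the sample text are all hypothetical; the sample values are shortened from rows of the table on this page, and the real file's column order and contents may differ.

```python
import csv
import io

# Column renames made in this revision: old header -> new header.
RENAMED = {"name/url": "name_url", "license_txt": "license"}


def read_models(csv_text: str) -> list[dict[str, str]]:
    """Parse CSV text with a header row, normalising the old column names."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = []
    for row in reader:
        # Keep every column; only rename the two that changed.
        rows.append({RENAMED.get(key, key): value for key, value in row.items()})
    return rows


# Illustrative sample only (assumed layout, descriptions shortened).
SAMPLE = """dirpath,dirsize,name_url,version,license,date,categories,description
/data/ai/models/nlp/gatortron,50.2 GiB,Gatortron,,Apache License 2.0,3-May-24,NLP,Clinical language model
/data/ai/models/nlp/gemma,250.1 GiB,Gemma,,gemma,9-Apr-24,NLP,Lightweight open models
"""

models = read_models(SAMPLE)
for model in models:
    print(model["name_url"], model["dirpath"], model["license"])
```

Fetching the file from the URL in the template call is deliberately left out, so the sketch runs without network access. Accepting both old and new headers is one possible design choice; a reader that only needs the current file could drop the RENAMED mapping.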

Revision as of 14:14, 21 May 2024

The UFIT Research Computing AI Support Team maintains a suite of commonly used AI models on HiPerGator. Users may copy these models to their own space, modify them, and follow the instructions to run them as jobs on HiPerGator. Each model's directory contains a readme file with additional information. To get help with the models or to ask any AI question, submit a ticket at https://support.rc.ufl.edu.


{|
! Name !! Categories !! Location on HiPerGator !! Dataset size (approximate) !! Version !! License !! Date added !! Description
|-
| Ultralytics YOLO || Computer vision || <code>/data/ai/models/computer_vision/ultralytics_yolov8</code> || 605.2 MiB || v8 || AGPL-3.0 License (OSI-approved open-source license, ideal for students and enthusiasts) or Enterprise License (designed for commercial use) || 5-May-24 || YOLOv8 Detect, Segment and Pose models pretrained on the COCO dataset are available here, as well as YOLOv8 Classify models pretrained on the ImageNet dataset. Track mode is available for all Detect, Segment and Pose models.
|-
| AlphaFold || Healthcare and life science || <code>/data/ai/models/healthcare_life_science/proteinfolding/alphafold</code> || 8.7 GiB || v2.0.0 || Apache License 2.0 || 6-Jul-22 || Predicts protein structures. If you publish research using AlphaFold, you must cite the original paper.
|-
| RoseTTAFold || Healthcare and life science || <code>/data/ai/models/healthcare_life_science/proteinfolding/rosettafold</code> || 1.0 GiB || v1.0.0 || MIT License || 11-Mar-21 || Predicts protein structures. If you publish research using RoseTTAFold, you must cite the original paper: https://www.biorxiv.org/content/10.1101/2021.06.14.448402v
|-
| StyleGAN || Imaging || <code>/data/ai/models/nvidia/stylegan3</code> || 7.2 GiB || 3 || Nvidia Source Code License || 29-Apr-24 || StyleGAN3 is a generative model for high-quality image synthesis that offers fine control over image style and content, suiting it to both creative and enterprise applications.
|-
| CLIP || Multimodal || <code>/data/ai/models/multimodel/clip/clip-vit-base-patch32</code> || 3.4 GiB || openai/clip-vit-base-patch32 || Apache License 2.0 || 17-Jul-23 || The clip-vit-base-patch32 model uses a ViT-B/32 Transformer as its image encoder and a masked self-attention Transformer as its text encoder. The encoders are trained to maximize the similarity of (image, text) pairs via a contrastive loss.
|-
| BiomedCLIP || Multimodal || <code>/data/ai/models/multimodel/clip/BiomedCLIP-PubMedBERT_256-vit_base_patch16_224</code> || 1.5 GiB || microsoft/BiomedCLIP-PubMedBERT_256-vit_base_patch16_224 || MIT License || 17-Jul-23 || BiomedCLIP is a biomedical vision-language foundation model pretrained with contrastive learning on PMC-15M, a dataset of 15 million figure-caption pairs extracted from biomedical research articles in PubMed Central. It uses PubMedBERT as the text encoder and a Vision Transformer as the image encoder, with domain-specific adaptations. It can perform vision-language processing (VLP) tasks such as cross-modal retrieval, image classification, and visual question answering.
|-
| Gemma || NLP || <code>/data/ai/models/nlp/gemma</code> || 250.1 GiB || || gemma || 9-Apr-24 || Gemma is a family of lightweight, state-of-the-art open models built from the same research and technology used to create the Gemini models. Developed by Google DeepMind and other teams across Google, Gemma is named after the Latin gemma, meaning "precious stone."
|-
| Llama || NLP || <code>/data/ai/models/nlp/llama</code> || 6.7 TiB || Llama2, Llama3 || llama || 19-Apr-24 || Llama models are language models developed by Meta AI. The latest version, Llama 3, significantly improves performance and accessibility for a range of natural language processing tasks.
|-
| Meditron || NLP || <code>/data/ai/models/nlp/meditron</code> || 564.2 GiB || || llama || 3-May-24 || Meditron is a suite of open-source medical large language models (LLMs). The team provides Meditron-7B and Meditron-70B, fine-tuned for medical tasks on a diverse medical dataset. Meditron-70B outperforms other models such as Llama-2-70B, GPT-3.5, and Flan-PaLM across multiple medical reasoning tasks.
|-
| Megatron-LM || NLP || <code>/data/ai/models/nlp/megatron</code> || 22.3 GiB || 2.2, 2.5, 3.0 || Apache License 2.0 || || Megatron-LM is a language model developed by the Applied Deep Learning Research team at NVIDIA.
|-
| Mistral AI || NLP || <code>/data/ai/models/nlp/mistral_ai</code> || 875.6 GiB || || Apache License 2.0 || 9-Apr-24 || Mistral AI offers a variety of language models, including open-weights models such as Mistral 7B, Mixtral 8x7B, and Mixtral 8x22B, as well as optimized commercial models such as Mistral Small, Mistral Medium, Mistral Large, and Mistral Embeddings.
|-
| DNABERT || NLP || <code>/data/ai/models/nvidia/bionemo/dnabert</code> || 64.3 GiB || 1.2 || || 28-Feb-24 || DNABERT generates a dense representation of a genome sequence by identifying contextually similar sequences in the human genome. It is a DNA sequence model trained on sequences from the human reference genome Hg38.p13.
|-
| Nemo_24.01_Gemma || NLP || <code>/data/ai/models/nvidia/nemo/nemo_24.01.gemma</code> || 21.4 GiB || || NVIDIA AI Product Agreement || 18-Apr-24 || NeMo framework container with the pre-trained Gemma model.
|-
| Nemo_24.03_StarCoder2 || NLP || <code>/data/ai/models/nvidia/nemo/nemo_24.01.starcoder2</code> || 22.6 GiB || 2 || NVIDIA AI Product Agreement || 18-Apr-24 || NeMo framework container with the pre-trained StarCoder2 model.
|-
| Nemo_24.03_CodeGemma || NLP || <code>/data/ai/models/nvidia/nemo/nemo_24.03.codegemma</code> || 20.2 GiB || || NVIDIA AI Product Agreement || 18-Apr-24 || NeMo framework container with the pre-trained CodeGemma model.
|-
| Gatortron || NLP || <code>/data/ai/models/nlp/gatortron</code> || 50.2 GiB || || Apache License 2.0 || 3-May-24 || GatorTron is a large clinical language model developed by researchers at University of Florida Health in collaboration with NVIDIA. It is designed to accelerate research and medical decision-making by extracting insights from massive volumes of clinical data with unprecedented speed and clarity.
|}
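
Because the listed sizes range from hundreds of MiB to several TiB, it can help to check available space before copying a model into your own directory. The following is a minimal Python sketch of that idea, not an official UFRC tool. The helper names parse_size and copy_model are hypothetical, the destination directory is whatever space you have access to, and the check looks only at free space on the destination filesystem, which is an assumption that ignores any group quota.

```python
import shutil
from pathlib import Path

# Binary unit multipliers matching the MiB/GiB/TiB suffixes used in the table.
_UNITS = {"KiB": 1024, "MiB": 1024**2, "GiB": 1024**3, "TiB": 1024**4}


def parse_size(text: str) -> int:
    """Convert a size string such as '8.7 GiB' to an approximate byte count."""
    value, unit = text.split()
    return round(float(value) * _UNITS[unit])


def copy_model(src: str, dest: str, listed_size: str) -> Path:
    """Copy a model directory to dest after a rough free-space check."""
    src_path, dest_path = Path(src), Path(dest)
    needed = parse_size(listed_size)
    # disk_usage needs an existing path, so inspect the destination's parent.
    free = shutil.disk_usage(dest_path.parent).free
    if free < needed:
        raise OSError(f"need about {needed} bytes, only {free} free")
    # copytree creates dest itself and raises if dest already exists.
    shutil.copytree(src_path, dest_path)
    return dest_path
```

A call might look like copy_model("/data/ai/models/healthcare_life_science/proteinfolding/rosettafold", "<your directory>/rosettafold", "1.0 GiB"), where the source path and size come from the table and the destination is a placeholder. The sizes in the table are approximate, so treat the check as a guard against obvious shortfalls rather than an exact guarantee.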