Scholar Research Skill

Search and analyze academic papers from open access sources with credibility scoring and detailed summaries.

When to Use

User wants to find papers on a specific topic
User needs credibility assessment of papers
User wants summarized research with methodology
User wants to track field evolution over time
User needs figures/tables extracted from top papers

Data Sources (Free/Open Access)

The skill searches across these sources:

- arXiv: Pre-prints (Physics, Math, CS, q-bio, q-fin)
- PubMed/PMC: Biomedical & Life sciences
- DOAJ: Peer-reviewed OA journals (all disciplines)
- OpenAlex: 250M+ papers metadata
- CORE: Largest OA full-text aggregator
- Semantic Scholar: Limited free tier
- Unpaywall: Finds free versions of paywalled papers
- CrossRef: All DOI metadata
- bioRxiv: Biology pre-prints
- medRxiv: Medicine pre-prints
- Zenodo: EU research data/papers
- HAL: French OA repository
- J-STAGE: Japanese OA repository
- SSRN: Economics, Law pre-prints

User-Added Sources

Users can add custom sources via config:

{
  "custom_sources": [
    {"name": "My University", "url": "https://repo.my.edu", "api": "..."}
  ]
}
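
A minimal sketch of how such a config might be read and checked. It assumes that `name` and an http(s) `url` are required and that `api` is optional; the source does not state validation rules, and the function name `load_custom_sources` is hypothetical.

```python
import json
from urllib.parse import urlparse


def load_custom_sources(config_text):
    """Return the usable entries from the "custom_sources" list of a config string."""
    config = json.loads(config_text)
    valid = []
    for entry in config.get("custom_sources", []):
        name = entry.get("name")
        parsed = urlparse(entry.get("url", ""))
        # Assumption: a usable source needs a name and an http(s) URL with a host.
        if name and parsed.scheme in ("http", "https") and parsed.netloc:
            valid.append(entry)
    return valid


example = '{"custom_sources": [{"name": "My University", "url": "https://repo.my.edu", "api": "..."}]}'
sources = load_custom_sources(example)
```

Entries that fail the check are skipped rather than raising, so one malformed source does not disable the rest.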

Scoring System

Default Weights (Total: 100 + 40 bonus)

Paper Quality (100 points):

| Factor | Weight | Description |
|--------|--------|-------------|
| citation_count | 15% | Times cited by other papers |
| publication_recency | 10% | Newer = more relevant |
| author_reputation | 12% | Combined h-index of authors |
| journal_impact | 12% | Impact factor, CiteScore |
| peer_review_status | 10% | Peer-reviewed vs pre-print |
| open_access | 8% | Free to read/download |
| retraction_status | 10% | Not retracted |
| author_network | 8% | Connected to established network |
| funder_acknowledgment | 5% | Clear funding sources |
| reproducibility | 5% | Code/data available |

Bonus Points (up to +40):

- Author Trust: +20 max
- Journal Reputation: +20 max
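
One way the weights and bonus caps could combine into a score, as a sketch only. It assumes each factor has already been normalized to a value between 0 and 1 (the source does not say how), and it divides by the actual sum of the weights, because the weights listed add up to 95 rather than 100. The names `quality_score` and `bonus_score` are hypothetical.

```python
DEFAULT_WEIGHTS = {
    "citation_count": 15,
    "publication_recency": 10,
    "author_reputation": 12,
    "journal_impact": 12,
    "peer_review_status": 10,
    "open_access": 8,
    "retraction_status": 10,
    "author_network": 8,
    "funder_acknowledgment": 5,
    "reproducibility": 5,
}
MAX_BONUS = {"author_trust": 20, "journal_reputation": 20}


def clamp(value, low, high):
    return min(max(value, low), high)


def quality_score(factors, weights=DEFAULT_WEIGHTS):
    """Weighted average of factor values in [0, 1], scaled to 0-100."""
    total_weight = sum(weights.values())
    weighted = sum(w * clamp(factors.get(name, 0.0), 0.0, 1.0)
                   for name, w in weights.items())
    # Assumption: normalize by the real weight total so a perfect paper scores 100.
    return 100.0 * weighted / total_weight


def bonus_score(bonus):
    """Sum the bonus components, each capped at its documented maximum."""
    return sum(clamp(bonus.get(name, 0.0), 0.0, cap) for name, cap in MAX_BONUS.items())


perfect = quality_score({name: 1.0 for name in DEFAULT_WEIGHTS})
capped = bonus_score({"author_trust": 25, "journal_reputation": 10})  # 20 + 10
```

Missing factors count as 0 here, which penalizes papers with sparse metadata; treating them as unknown and excluding their weight would be an equally reasonable choice.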

Customizing Weights

Users can modify weights in config:

{
  "scoring": {
    "citation_count": 25,
    "publication_recency": 5
  }
}

Or use preset profiles: "strict", "recent_only", "balanced"
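
A sketch of how user overrides might be merged onto the defaults. The function name `apply_weight_overrides` is hypothetical, and rejecting unknown factor names is an assumption. The contents of the preset profiles are not given here, so they are left out.

```python
def apply_weight_overrides(defaults, overrides):
    """Return a copy of defaults with user overrides applied; reject unknown factors."""
    unknown = set(overrides) - set(defaults)
    if unknown:
        raise KeyError(f"Unknown scoring factors: {sorted(unknown)}")
    merged = dict(defaults)  # copy so the defaults stay untouched
    merged.update(overrides)
    return merged


defaults = {"citation_count": 15, "publication_recency": 10, "open_access": 8}
merged = apply_weight_overrides(
    defaults, {"citation_count": 25, "publication_recency": 5}
)
```

After an override the weights may no longer add up to their original total, so any scoring step that uses them should normalize by the current sum.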

Output Format

Top Papers (default: 5, user-configurable)

[1] Paper Title (Year)
    Score: 95/100 | Citations: 234
    📄 PDF | 📊 Figures | 🔬 SI

    Summary: [One paragraph]

    Methodology: [Detailed breakdown]

Field Timeline

📈 FIELD TIMELINE (N papers)

2024: ████████████████████ 15 papers
       → Major: [Breakthrough 1]
       → Trend: [Trend 1]

2023: ████████████████ 12 papers
       → Major: [Breakthrough 2]
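
The bar portion of this timeline could be produced roughly as follows. This is a sketch that assumes bars are scaled so the busiest year fills 20 characters, which matches the sample above; the "Major" and "Trend" annotations are not generated, and `render_timeline` is a hypothetical name.

```python
from collections import Counter


def render_timeline(years, width=20):
    """Render one bar per publication year, newest first, scaled to the peak year."""
    counts = Counter(years)
    lines = [f"📈 FIELD TIMELINE ({len(years)} papers)", ""]
    if not counts:
        return "\n".join(lines)
    peak = max(counts.values())
    for year in sorted(counts, reverse=True):
        n = counts[year]
        # At least one block so that sparse years remain visible.
        bar = "█" * max(1, round(width * n / peak))
        lines.append(f"{year}: {bar} {n} papers")
    return "\n".join(lines)


timeline = render_timeline([2024] * 15 + [2023] * 12)
```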

Credibility Distribution

📊 Credibility Distribution

Score 90-100: ██ (5) ★ Top
Score 70-89:  ████████ (15)
Score 50-69:  ██████████████████ (25)
Score 30-49:  ██████████ (10)
Score 0-29:   ██ (2)

[████████████░░░░░░░░░] Average: 58/100
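
A sketch of the bucketing behind this display, using the five score bands shown above. It assumes scores are rounded to whole numbers and clamped to 0-100 before bucketing (a score with bonus points could otherwise exceed 100); the function names and the 20-character bar width are hypothetical.

```python
BANDS = [(90, 100), (70, 89), (50, 69), (30, 49), (0, 29)]


def credibility_distribution(scores):
    """Count how many scores fall into each band."""
    counts = {band: 0 for band in BANDS}
    for score in scores:
        # Assumption: round to an integer and clamp into the displayed 0-100 range.
        s = min(max(int(round(score)), 0), 100)
        for low, high in BANDS:
            if low <= s <= high:
                counts[(low, high)] += 1
                break
    return counts


def average_bar(scores, width=20):
    """Render the average score as a filled/empty progress bar."""
    avg = sum(scores) / len(scores) if scores else 0
    filled = round(width * avg / 100)
    return f"[{'█' * filled}{'░' * (width - filled)}] Average: {round(avg)}/100"


scores = [95, 72, 55, 55, 10]
dist = credibility_distribution(scores)
bar = average_bar(scores)
```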

Workflow

Search: Query across all enabled sources
Fetch: Download metadata + PDFs
Score: Calculate credibility scores
Sort: Rank by score + relevance
Present: Top N papers + timeline
Extract: Figures from top-scored papers (optional)
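
The Search, Score, Sort, and Present steps can be sketched as a small pipeline. The search and scoring functions are passed in as callables because their real implementations are not described here; treating relevance as a tie-breaker after score is an assumption, since the source only says "score + relevance". All names below are hypothetical.

```python
def rank_papers(papers, top_n=5):
    """Sort by score, then relevance (both descending), and keep the top N."""
    ordered = sorted(
        papers,
        key=lambda p: (p["score"], p.get("relevance", 0.0)),
        reverse=True,
    )
    return ordered[:top_n]


def run_pipeline(search, score, query, top_n=5):
    """Search -> Score -> Sort -> Present, with search and score supplied by the caller."""
    papers = search(query)
    for paper in papers:
        paper["score"] = score(paper)
    return rank_papers(papers, top_n)


def fake_search(query):
    # Stand-in for real source queries; returns already-fetched metadata.
    return [
        {"title": "A", "citations": 10, "relevance": 0.9},
        {"title": "B", "citations": 50, "relevance": 0.4},
        {"title": "C", "citations": 50, "relevance": 0.7},
    ]


top = run_pipeline(fake_search, lambda p: p["citations"], "quantum computing", top_n=2)
# top holds C then B: their scores are equal, so relevance breaks the tie
```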

Usage Examples

Find papers on: machine learning
Fields: computer science, AI
Top papers: 5
Extract figures: true

Find papers on: quantum computing
Fields: physics
Top papers: 10
Extract figures: false
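
If requests in this format are handled programmatically, they could be parsed along these lines. This is a sketch: the output keys (`topic`, `fields`, `top_n`, `extract_figures`) and the function name are hypothetical, and the default of 5 top papers follows the Output Format section.

```python
def parse_request(text):
    """Parse a 'Key: value' request block into a dict with typed values."""
    raw = {}
    for line in text.strip().splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines without a colon
            raw[key.strip().lower()] = value.strip()
    return {
        "topic": raw.get("find papers on", ""),
        "fields": [f.strip() for f in raw.get("fields", "").split(",") if f.strip()],
        "top_n": int(raw.get("top papers", 5)),
        "extract_figures": raw.get("extract figures", "false").lower() == "true",
    }


request = parse_request(
    "Find papers on: machine learning\n"
    "Fields: computer science, AI\n"
    "Top papers: 5\n"
    "Extract figures: true\n"
)
```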

Dependencies

Python 3.8+
requests (API calls)
beautifulsoup4 (parsing)
pypdf2 (PDF extraction)
opencv-python (figure detection)
transformers (summarization)
matplotlib (visualization)

Configuration

See config.json for:

- API keys
- Source enable/disable
- Scoring weights
- Display preferences
- Custom sources

Notes

Always prioritize open access sources
Cite sources in responses
Warn about pre-print limitations
Check retraction status when available
Respect rate limits
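
For the rate-limit note, one simple approach is a minimum delay between consecutive requests to the same source. The sketch below is illustrative only: the `Throttle` class is hypothetical, and the right interval depends on each source's published policy, which is not covered here.

```python
import time


class Throttle:
    """Enforce a minimum number of seconds between consecutive calls to wait()."""

    def __init__(self, min_interval, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock  # injectable so the class can be tested without real delays
        self._sleep = sleep
        self._last = None

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


# Usage: call throttle.wait() immediately before each request to a given source.
throttle = Throttle(min_interval=1.0)
```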

scholar-research

Installation