Sample Text Processor

The Sample Text Processor is a simple skill designed to demonstrate the basic structure and functionality expected in the claude-skills ecosystem.

Category: design
Source: alirezarezvani/claude-skills

What Is Sample Text Processor?

The Sample Text Processor is a foundational skill in the claude-skills ecosystem that provides essential text analysis and transformation capabilities. Developed as a reference implementation by the Claude Skills Engineering Team, this skill demonstrates the fundamental structure, documentation practices, and functional design expected for new skills in the ecosystem. Written purely in Python using the standard library, it is categorized as a BASIC tier skill under text processing.

This skill performs common text operations such as word counting, character analysis, and simple text transformations. It is designed to process individual text files or batches of files within directories, generating output in both human-readable and JSON formats. The Sample Text Processor serves both practical and educational purposes, making it an excellent starting point for developers who aim to build and contribute their own skills to the claude-skills repository.

Why Use Sample Text Processor?

There are several compelling reasons to use the Sample Text Processor, whether as a utility tool or as a reference:

  • Reference Implementation: It exemplifies the recommended structure, documentation, and implementation patterns for claude-skills, which makes it a practical model for new contributors.
  • Simplicity and Clarity: The skill is intentionally designed to be straightforward, focusing on clear, maintainable code and minimal dependencies. This makes it easy to audit, extend, or adapt for new use cases.
  • Versatile Text Processing: Despite its simplicity, it offers a useful set of features for analyzing and transforming text, which can be directly utilized in content analysis, data preprocessing, or educational projects.
  • Dual Output Formats: By supporting both JSON and human-readable outputs, it caters to a variety of downstream consumers, from automated pipelines to manual reviewers.
  • No External Dependencies: Leveraging only the Python standard library ensures ease of deployment and broad compatibility across environments.

How to Get Started

To begin using the Sample Text Processor, follow these steps:

1. Clone the Repository

Retrieve the skill’s source code from its official repository:

git clone https://github.com/alirezarezvani/claude-skills.git
cd claude-skills/engineering/skill-tester/assets/sample-skill

2. Review the Code

Open the main Python script (sample_text_processor.py or a similarly named file) to familiarize yourself with its architecture and entry points.

3. Run the Processor on a Text File

Assuming the script is named sample_text_processor.py, you can process a file as follows:

python sample_text_processor.py --input /path/to/file.txt --output-format json

This command processes the specified file and outputs the results in JSON format. To receive a human-readable report, use:

python sample_text_processor.py --input /path/to/file.txt --output-format human

4. Batch Processing

To analyze all .txt files in a directory, use the --input-dir flag (if supported):

python sample_text_processor.py --input-dir /path/to/directory --output-format json

5. Text Transformation

To apply a transformation, such as converting text to uppercase, add the appropriate flag:

python sample_text_processor.py --input file.txt --transform uppercase
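
The flags used in the commands above could be wired up with argparse roughly as follows. This is a minimal sketch, not the skill's actual parser: the function name build_parser, the "human" default, and the "title" choice value are assumptions made for illustration, and the real script may accept different values.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags shown in the commands above."""
    parser = argparse.ArgumentParser(prog="sample_text_processor.py")

    # Assumption: a run processes either one file or one directory, never both.
    source = parser.add_mutually_exclusive_group(required=True)
    source.add_argument("--input", help="path to a single text file")
    source.add_argument("--input-dir", help="directory containing .txt files")

    # Assumption: "human" is the default; the real default is not documented here.
    parser.add_argument("--output-format", choices=["json", "human"], default="human")

    # Assumption: the value names for lowercase and title case are guesses.
    parser.add_argument("--transform", choices=["uppercase", "lowercase", "title"])
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(["--input", "file.txt", "--transform", "uppercase"])
print(args.input, args.output_format, args.transform)
```

Grouping --input and --input-dir as mutually exclusive is a design choice of this sketch; it makes an ambiguous invocation fail early with a usage message.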

Key Features

The Sample Text Processor provides the following capabilities:

Word Count Analysis

  • Counts the total number of words
  • Identifies the number of unique words
  • Computes word frequency distribution
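
The three word metrics above can be computed in a few lines with the standard library. The sketch below is an illustration under stated assumptions, not the skill's own code: the function name analyze_words is hypothetical, and the tokenization rule (case-insensitive runs of letters, digits, and apostrophes) is a guess at what counts as a word.

```python
import re
from collections import Counter


def analyze_words(text: str) -> dict:
    """Hypothetical helper returning total, unique, and per-word counts."""
    # Assumption: words are case-insensitive runs of letters, digits, or apostrophes.
    words = re.findall(r"[A-Za-z0-9']+", text.lower())
    frequencies = Counter(words)
    return {
        "word_count": len(words),
        "unique_words": len(frequencies),
        # most_common() orders entries from most to least frequent.
        "word_frequencies": dict(frequencies.most_common()),
    }


print(analyze_words("Sample text, sample processor."))
```

The key names follow the JSON example in this document; a different tokenization rule would change the counts.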

Character Statistics

  • Reports total character count
  • Counts the number of lines
  • Detects and enumerates special characters (punctuation, whitespace, etc.)
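
A minimal sketch of these character statistics follows. The function name analyze_characters is hypothetical, and the definition of "special" is an assumption: here it means any character that is neither alphanumeric nor whitespace, so whitespace is left out of the tally even though the skill itself may count it.

```python
from collections import Counter


def analyze_characters(text: str) -> dict:
    """Hypothetical helper returning character, line, and punctuation counts."""
    # Assumption: whitespace is excluded from the "special" tally in this sketch.
    special = Counter(ch for ch in text if not ch.isalnum() and not ch.isspace())
    return {
        "character_count": len(text),
        # splitlines() does not count a trailing newline as an extra line.
        "line_count": len(text.splitlines()),
        "special_characters": dict(special),
    }


print(analyze_characters("Hi!\nWhy? Why!"))
```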

Text Transformations

  • Converts text to uppercase, lowercase, or title case, as specified by the user
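
The case transformations map directly onto built-in str methods. The sketch below assumes hypothetical mode names ("uppercase", "lowercase", "title") and a hypothetical function name transform_text; the skill's real option values may differ.

```python
def transform_text(text: str, mode: str) -> str:
    """Hypothetical helper applying one named case transformation."""
    # Assumption: these mode names are illustrative, not the skill's documented values.
    transforms = {
        "uppercase": str.upper,
        "lowercase": str.lower,
        "title": str.title,
    }
    try:
        return transforms[mode](text)
    except KeyError:
        raise ValueError(f"unsupported transform: {mode!r}") from None


print(transform_text("sample text processor", "title"))
```

Note that str.title capitalizes the letter after an apostrophe (for example, "don't" becomes "Don'T"), which may matter for prose input.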

File and Directory Processing

  • Processes individual files or entire directories (batch mode)
  • Automatically skips non-text files when processing directories
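
Batch mode with skipping of non-text files can be sketched with pathlib. The helper name iter_text_files is hypothetical, and so is the rule for what counts as a text file: this sketch treats any regular file with a .txt suffix as text and does not descend into subdirectories, which may not match the skill's actual behavior.

```python
from pathlib import Path
from typing import Iterator


def iter_text_files(directory: str) -> Iterator[Path]:
    """Hypothetical helper yielding the .txt files directly inside a directory."""
    # Assumption: "text file" means a regular file whose suffix is .txt.
    for path in sorted(Path(directory).iterdir()):
        if path.is_file() and path.suffix.lower() == ".txt":
            yield path


# Example: list the text files in the current working directory.
for text_file in iter_text_files("."):
    print(text_file.name)
```

Sorting the entries keeps the processing order stable across runs, which makes batch output easier to compare.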

Dual Output Formats

  • Outputs results as structured JSON for programmatic consumption
  • Generates concise, human-readable summaries for manual review

Example Output

JSON Output:

{
  "word_count": 120,
  "unique_words": 80,
  "word_frequencies": {"sample": 10, "text": 8, "processor": 5},
  "character_count": 950,
  "line_count": 22,
  "special_characters": {"!": 3, "?": 2}
}

Human-Readable Output:

Total Words: 120
Unique Words: 80
Most Frequent Words:
  sample: 10
  text: 8
  processor: 5

Character Count: 950
Line Count: 22
Special Characters: ! (3), ? (2)
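
Both formats can be produced from the same result dictionary. The sketch below is one way to do that, assuming the key names from the JSON example above; the function name render and its top_n parameter are hypothetical and not part of the skill.

```python
import json


def render(result: dict, output_format: str = "human", top_n: int = 3) -> str:
    """Hypothetical helper formatting one analysis result as JSON or plain text."""
    if output_format == "json":
        return json.dumps(result, indent=2)

    # Sort by count, highest first, and keep the top_n entries.
    top = sorted(result["word_frequencies"].items(), key=lambda kv: kv[1], reverse=True)[:top_n]
    specials = ", ".join(f"{ch} ({n})" for ch, n in result["special_characters"].items())
    lines = [
        f"Total Words: {result['word_count']}",
        f"Unique Words: {result['unique_words']}",
        "Most Frequent Words:",
        *[f"  {word}: {count}" for word, count in top],
        "",
        f"Character Count: {result['character_count']}",
        f"Line Count: {result['line_count']}",
        f"Special Characters: {specials}",
    ]
    return "\n".join(lines)


example = {
    "word_count": 120,
    "unique_words": 80,
    "word_frequencies": {"sample": 10, "text": 8, "processor": 5},
    "character_count": 950,
    "line_count": 22,
    "special_characters": {"!": 3, "?": 2},
}
print(render(example))
```

Keeping the analysis result as a plain dictionary and formatting it in a separate step means a new output format only needs a new renderer.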

Best Practices

When using or extending the Sample Text Processor, consider the following best practices:

  • Maintain Code Readability: Follow the clear and modular patterns demonstrated in the reference code to ensure maintainability.
  • Extend with Purpose: When adding features, maintain the separation between analysis and transformation logic for ease of testing and reusability.
  • Documentation: Adhere to the documentation standards shown: document every new feature both in the code and in the accompanying markdown files.
  • Testing: Implement unit tests for any new functionality to preserve reliability, especially if contributing back to the claude-skills repository.
  • Follow Output Conventions: Preserve the dual-format output paradigm to maximize compatibility with other tools and workflows.
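
As an illustration of the testing practice above, a unit test for a new analysis function might look like the following. Everything here is hypothetical: count_words is a stand-in for whatever function a contributor adds, and the repository may prescribe a different test layout or framework.

```python
import unittest


def count_words(text: str) -> int:
    """Hypothetical stand-in for a newly added analysis function."""
    return len(text.split())


class CountWordsTest(unittest.TestCase):
    def test_counts_whitespace_separated_words(self):
        self.assertEqual(count_words("sample text processor"), 3)

    def test_empty_text_has_no_words(self):
        self.assertEqual(count_words(""), 0)


# Run the tests programmatically so the sketch works outside a test runner.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(CountWordsTest)
outcome = unittest.TextTestRunner(verbosity=0).run(suite)
```

Because the function under test takes a string and returns a value, it can be tested without touching the file system, which is one benefit of keeping analysis logic separate from file handling.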

Important Notes

  • Intended as a Template: The Sample Text Processor is not optimized for performance or large-scale data. For heavy-duty processing, consider using or building on more advanced skills.
  • BASIC Tier Scope: The functionality is intentionally limited to meet the criteria for BASIC tier skills; add enhancements sparingly so the skill stays within that tier.
  • Python Standard Library Only: No external dependencies are allowed. If you require advanced natural language processing, you may need to create a separate, higher-tier skill.
  • Repository Structure: When contributing, preserve the file and directory structure outlined in the reference implementation to maintain ecosystem consistency.
  • Skill Evolution: Monitor updates to the claude-skills guidelines, as best practices and interface requirements may evolve over time.

By leveraging and learning from the Sample Text Processor, developers can efficiently create robust, well-documented skills that integrate seamlessly with the claude-skills ecosystem.