selectors-scan

v1.0.13

Published

8 months ago

A CLI utility to scan and analyze CSS selectors (Python-powered, Node wrapper)

amerja-klick

css selector analysis python cli

🎯 CSS Usage Mapper

A powerful Python tool that analyzes CSS selector usage across HTML files, helping you identify unused styles and optimize your CSS for better performance.

✨ Features

Comprehensive CSS Analysis: Extracts and analyzes all CSS selectors from multiple files
Smart Selector Matching: Uses multiple strategies to handle complex modern CSS selectors
Detailed Reporting: Generates 3 different CSV reports for various analysis needs
Complex Selector Support: Handles :is(), :where(), :has(), nested :not(), and more
Error Handling: Robust parsing with detailed error reporting
Performance Tracking: Detailed statistics including timing and complexity analysis
Fallback Strategies: Intelligent handling of selectors that BeautifulSoup can't parse directly
Web Scraping: Fetches HTML content from a URL specified in the .env file

🚀 Installation

Prerequisites

Python 3.7+
Virtual environment (recommended)

Setup

Clone or download the project to your local machine

Create and activate virtual environment:

python -m venv venv
# Windows
venv\Scripts\activate
# macOS/Linux
source venv/bin/activate

Install dependencies:

pip install beautifulsoup4 cssutils python-dotenv requests

Set up the .env file:
- Create a .env file in the root directory.
- Add the website URL:
```
WEBSITE_URL=https://websiteurl.com
```
Set up the config.json file:
- Create a config.json file in the root directory
- Add website pages path:
```
{
 "pages": [
    "about-us",
    "contact",
    "services"
 ]
}
```

🏃‍♂️ Quick Start

Place your files:
- The tool will automatically search for all CSS and HTML files in the project directory or fetch HTML from the specified URL.
Run the analysis:
```
python css_usage_mapper.py
```
View results:
- css_selector_usage.csv - Summary of all selectors
- css_selector_details.csv - Detailed selector-to-page mapping
- css_analysis_stats.csv - Comprehensive statistics

🔍 Matching Methods

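The tool uses a two-tier matching system: it tries direct matching first, then falls back to simplified matching when direct matching fails. Complex selectors are detected up front, and every result records which method succeeded: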

1. Direct Matching (Primary Method)

Uses BeautifulSoup's native soup.select(selector) method:

matches = soup.select(".header nav ul li")  # Direct CSS selector matching

Best for:

Simple class selectors (.class)
ID selectors (#id)
Element selectors (div, p)
Basic combinations (div.class, #id .class)
Standard pseudo-classes (:hover, :first-child)

2. Simplified Matching (Fallback Method)

When direct matching fails, the tool automatically tries simplified versions:

# Original: ".button:is(.primary,.secondary)"
# Simplified attempts:
# 1. ".button.primary"
# 2. ".button.secondary" 
# 3. ".button"
# 4. ".primary .secondary"

Handles:

:is() pseudo-class expansion
:not() removal
Pseudo-element stripping (::before → removed)
Class/ID extraction from complex selectors
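
The simplification steps listed above could be sketched roughly as follows. This is a minimal illustration, not the tool's actual code: the function name `simplify_selector` and the order of the candidates are assumptions, and the sketch only handles a single, non-nested `:is()` or `:where()` group.

```python
import re
from typing import List


def simplify_selector(selector: str) -> List[str]:
    """Return simpler candidate selectors (hypothetical helper)."""
    # Strip pseudo-elements such as ::before or ::after.
    base = re.sub(r"::[\w-]+", "", selector)
    # Remove simple (non-nested) :not(...) groups.
    base = re.sub(r":not\([^()]*\)", "", base)

    candidates = []
    group = re.search(r":(?:is|where)\(([^()]*)\)", base)
    if group:
        prefix, suffix = base[:group.start()], base[group.end():]
        # Expand :is(.a, .b) into one candidate per alternative.
        for option in group.group(1).split(","):
            candidates.append(f"{prefix}{option.strip()}{suffix}")
        # Last resort: the selector without the :is() group.
        candidates.append(f"{prefix}{suffix}")
    else:
        candidates.append(base)

    # Drop empty strings and duplicates while keeping order.
    result = []
    for candidate in (c.strip() for c in candidates):
        if candidate and candidate not in result:
            result.append(candidate)
    return result


print(simplify_selector(".button:is(.primary,.secondary)"))
```

Each candidate would then be tried in order until one of them can be matched.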

3. Complex Selector Detection

Automatically identifies selectors that need special handling:

COMPLEX_PATTERNS = [
    r':is\(',           # :is(.a,.b)
    r':where\(',        # :where(.a,.b)  
    r':has\(',          # :has(.child)
    r':not\([^)]*\([^)]*\)', # nested :not()
    r'::',              # ::before, ::after
    r'@',               # @media, @keyframes
    r'\[.*\*=.*\]',     # complex attributes
]
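
A list like this might be applied with a small helper. The sketch below repeats the patterns so that it runs on its own; the function name `is_complex` is an assumption and may not match the tool's internals.

```python
import re

COMPLEX_PATTERNS = [
    r':is\(',
    r':where\(',
    r':has\(',
    r':not\([^)]*\([^)]*\)',
    r'::',
    r'@',
    r'\[.*\*=.*\]',
]


def is_complex(selector: str) -> bool:
    """Return True if any complexity pattern occurs in the selector."""
    return any(re.search(pattern, selector) for pattern in COMPLEX_PATTERNS)


for sel in (".header nav ul li", ".button:is(.primary, .secondary)", ".element::before"):
    print(sel, "->", "complex" if is_complex(sel) else "simple")
```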

4. Match Method Tracking

Each result shows which method succeeded:

direct - BeautifulSoup parsed the selector directly
simplified - Used fallback simplification strategy
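
One way the two tiers and the method label could fit together is sketched below. It is an illustration under stated assumptions, not the tool's implementation: `match_with_fallback` is a hypothetical name, "direct matching fails" is taken to mean that the matcher raises an error, and `select` is any callable that takes a selector string and returns a list of matches.

```python
from typing import Callable, Iterable, Optional, Sequence, Tuple


def match_with_fallback(
    select: Callable[[str], Sequence],
    selector: str,
    fallbacks: Iterable[str],
) -> Tuple[int, Optional[str]]:
    """Return (match_count, method); method is "direct", "simplified", or None."""
    try:
        # Tier 1: hand the selector to the matcher unchanged.
        return len(select(selector)), "direct"
    except Exception:
        pass  # Assumption: a failure shows up as an exception.

    parsed_any = False
    for candidate in fallbacks:
        try:
            matches = select(candidate)
        except Exception:
            continue  # This candidate could not be parsed either.
        parsed_any = True
        if matches:
            # Tier 2: the first simplified candidate with matches wins.
            return len(matches), "simplified"
    # No candidate matched; report None if none could even be parsed.
    return 0, ("simplified" if parsed_any else None)
```

With BeautifulSoup, `soup.select` can be passed as the `select` argument, and the list of simplified selectors as `fallbacks`.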

📊 Output Files

1. `css_selector_usage.csv` - Summary Report

| Column | Description |
|--------|-------------|
| Selector | The CSS selector |
| Used In HTML Files | Comma-separated list of files using this selector |
| Match Count | Total number of matching elements across all files |
| Match Method | direct or simplified |
| Is Complex | Yes if selector uses modern CSS features |

2. `css_selector_details.csv` - Detailed Mapping

| Column | Description |
|--------|-------------|
| Selector | The CSS selector |
| HTML File | Individual file using this selector (one row per file) |
| Match Count | Number of matches in this specific file |
| Match Method | How the selector was matched |
| Is Complex | Complexity indicator |
| Status | Used, Unused, or Invalid |

Use cases:

Find which pages use specific selectors
Identify selectors used only on certain pages
Filter by file for page-specific analysis
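
As a sketch of the first use case, the detail report can be grouped by selector with Python's csv module. The column names are taken from the table above; the helper name `pages_by_selector` and the sample rows are made up for illustration, and the header row of the real file is assumed to match those column names.

```python
import csv
import io
from collections import defaultdict


def pages_by_selector(csv_file):
    """Map each used selector to the HTML files it matched in."""
    pages = defaultdict(list)
    for row in csv.DictReader(csv_file):
        if row["Status"] == "Used":
            pages[row["Selector"]].append(row["HTML File"])
    return dict(pages)


# Illustrative rows only; real data would come from
# open("css_selector_details.csv", newline="").
sample = io.StringIO(
    "Selector,HTML File,Match Count,Match Method,Is Complex,Status\n"
    ".header,About Us,1,direct,No,Used\n"
    ".header,Contact,1,direct,No,Used\n"
    ".promo,About Us,0,direct,No,Unused\n"
)
usage = pages_by_selector(sample)
print(usage)
```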

3. `css_analysis_stats.csv` - Statistics Report

Comprehensive metrics including:

Selector Counts: Total, used, unused, invalid
Complexity Analysis: Simple vs complex selectors
Performance Metrics: Processing time, match counts
Error Details: Failed selectors with error messages
Usage Rate: Percentage of selectors actually used

⚙️ Configuration

Folder Structure

project/
├── css_usage_mapper.py
├── venv/               # Virtual environment
└── output files...     # Generated CSV reports

The application uses a config.json file to specify the paths of the web pages to analyze. This file should be located in the root directory of the project.

`config.json` Structure

The config.json file contains a list of page paths that the application will use to construct full URLs for analysis. Each path corresponds to a specific page on the website.

Example `config.json`

{
  "pages": [
    "about-us",
    "contact",
    "services"
  ]
}

Explanation

pages: An array of strings, where each string is a path to a page on the website. The application will combine these paths with the base URL specified in the .env file to form complete URLs.
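
A minimal sketch of how the paths might be combined with the base URL follows. The function name `build_page_urls` is hypothetical, and joining with a single slash is an assumption; the tool itself may build URLs differently. In practice the base URL would come from WEBSITE_URL in the .env file.

```python
import json
from urllib.parse import urljoin


def build_page_urls(base_url, config_text):
    """Combine the base URL with each entry of "pages" in config.json."""
    pages = json.loads(config_text)["pages"]
    # Normalize so the base ends with exactly one trailing slash.
    base = base_url.rstrip("/") + "/"
    return [urljoin(base, page.lstrip("/")) for page in pages]


config_text = '{"pages": ["about-us", "contact", "services"]}'
for url in build_page_urls("https://websiteurl.com", config_text):
    print(url)
```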

Generating Page Titles

The application automatically generates page titles from the paths by replacing hyphens with spaces and capitalizing each word. For example, the path about-us will be converted to the title "About Us".
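
The rule described here can be expressed in one line of Python. The function name `page_title` is an assumption used only for this sketch.

```python
def page_title(path):
    """Turn a page path such as "about-us" into a title such as "About Us"."""
    # Replace hyphens with spaces, then capitalize each word.
    return path.replace("-", " ").title()


print(page_title("about-us"))
```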

Usage

Ensure that the config.json file is correctly formatted and placed in the root directory before running the application. The application will read this file to determine which pages to analyze.

🔧 Advanced Usage

Analyzing Specific File Types

The tool processes:

CSS files: .css extension
HTML files: .html extension

Performance Tips

Large Projects: Process files in batches for very large codebases
Memory Usage: The tool loads all files into memory - consider available RAM
Processing Time: Complex selectors take longer to analyze

Integration Examples

# Import as module
from css_usage_mapper import CSSUsageAnalyzer

analyzer = CSSUsageAnalyzer()
analyzer.run_analysis()

# Access statistics
print(f"Usage rate: {analyzer.stats['used_selectors']} / {analyzer.stats['total_selectors']}")
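
The example above prints raw counts. A percentage could be derived as in the sketch below, assuming the stats dictionary exposes the `used_selectors` and `total_selectors` keys shown above; the helper name `usage_rate` is hypothetical.

```python
def usage_rate(stats):
    """Return the share of used selectors as a percentage."""
    total = stats.get("total_selectors", 0)
    if not total:
        # Avoid dividing by zero when no selectors were found.
        return 0.0
    return 100.0 * stats.get("used_selectors", 0) / total


# Illustrative numbers, not real results.
print(f"{usage_rate({'used_selectors': 1500, 'total_selectors': 6000}):.1f}%")
```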

🛠️ Troubleshooting

Common Issues

Permission Error (Windows)

PermissionError: [Errno 13] Permission denied: 'css_selector_usage.csv'

Solution: Close Excel or any program that has the CSV files open.

No Selectors Found

Use Git to check that the CSS files have changes.
Verify CSS files have .css extension
Check CSS syntax is valid

No Matches Found

Ensure the website has HTML pages.
Verify HTML files have .html extension
Check that HTML contains the expected class/ID names

High Invalid Selector Count

This is normal for modern CSS! Many advanced selectors aren't supported by BeautifulSoup but may still be functional in browsers.

Debugging Tips

Check logs: The tool provides detailed logging during execution
Review stats CSV: Contains error details for failed selectors
Verify file structure: Ensure correct folder organization

🔬 Technical Details

Dependencies

cssutils: CSS parsing and rule extraction
BeautifulSoup4: HTML parsing and CSS selector matching
requests: Fetching HTML content from URLs
python-dotenv: Loading environment variables from the .env file
re: Regular expressions for complex selector handling
csv: Output file generation
logging: Progress tracking and error reporting

Performance Characteristics

Memory Usage: O(n + m) where n = CSS selectors, m = HTML elements
Time Complexity: O(n × m × f) where f = number of HTML files
Typical Processing: 6,000+ selectors across 3 files in ~2-3 minutes

Selector Complexity Handling

The tool categorizes selectors by complexity:

Simple Selectors (Direct matching):

.header { }
#navigation { }
div.content { }
ul li a { }

Complex Selectors (Simplified matching):

.button:is(.primary, .secondary) { }
.card:not(.hidden):has(.content) { }
.element::before { }

Accuracy Notes

Direct matches: 100% accurate (BeautifulSoup native support)
Simplified matches: ~90-95% accurate (best-effort approximation)
Invalid selectors: Logged for manual review

📈 Output Analysis Tips

Finding Unused CSS

# Filter for unused selectors
grep ",Unused" css_selector_details.csv

# Count usage by file
grep "Share Your Story" css_selector_details.csv | wc -l

Identifying Complex Selectors

# Find complex selectors that failed
grep ",Yes," css_selector_details.csv | grep "Invalid"
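
Selectors can themselves contain commas, for example inside `:is()` or `:has()`, so a CSV-aware filter may be more robust than grep. The sketch below assumes the column names of the detail report described earlier; the helper name `complex_invalid_selectors` and the sample rows are illustrative only.

```python
import csv
import io


def complex_invalid_selectors(csv_file):
    """Return the complex selectors whose status is Invalid."""
    return sorted({
        row["Selector"]
        for row in csv.DictReader(csv_file)
        if row["Is Complex"] == "Yes" and row["Status"] == "Invalid"
    })


# Illustrative rows only; real data would come from
# open("css_selector_details.csv", newline="").
sample = io.StringIO(
    "Selector,HTML File,Match Count,Match Method,Is Complex,Status\n"
    '".card:has(.a, .b)",About Us,0,,Yes,Invalid\n'
    ".header,About Us,1,direct,No,Used\n"
)
failed = complex_invalid_selectors(sample)
print(failed)
```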

Performance Insights

High "simplified" method usage indicates modern CSS features
Low usage rates suggest opportunities for CSS cleanup
Invalid selectors may need manual browser testing

🚀 Ready to Optimize Your CSS?

This tool provides comprehensive insights into your CSS usage, helping you:

Remove unused styles for faster loading
Identify complex selectors that need testing
Understand style dependencies across pages
Optimize CSS architecture for better maintainability

🌐 Web Scraping for Analysis

The tool fetches HTML content from the URL specified in the .env file. Ensure the .env file in the project root sets WEBSITE_URL, for example WEBSITE_URL=https://websiteurl.com.

Node.js Wrapper

This package can now be installed and used directly with npm or npx.

# Install globally
npm install -g selectors-scan

# Run anywhere
selectors-scan --help

# Or run without installing
npx selectors-scan path/to/your/project

Prerequisite: Ensure Python 3 is installed and available in your PATH.