企業級客服支援系統是現代企業營運不可或缺的基礎設施,它不僅是客戶與企業溝通的橋樑,更是收集產品問題、優化服務品質、累積知識資產的關鍵平台。一個設計良好的 Helpdesk 系統能夠顯著提升客戶滿意度,降低支援成本,並透過資料分析發現產品與服務的改進機會。然而,建構一個真正高效的客服系統絕非簡單的工單管理,它需要考慮多角色的權限控管、複雜的業務流程、與既有系統的整合、高效的搜尋機制,以及良好的使用者體驗設計。本文將深入探討企業級 Helpdesk 系統的完整架構,從需求分析、功能設計、資料庫建模到技術實作,系統性地呈現一個專業客服系統應該具備的各項要素。特別是在搜尋功能方面,我們將詳細介紹 Apache Solr 這個業界領先的全文搜尋引擎,從基礎概念到實際部署配置,協助讀者建構強大的企業級搜尋能力。

Helpdesk 系統架構與功能設計

企業級 Helpdesk 系統的核心在於有效管理客戶問題的完整生命週期,從問題提交、分類派工、處理追蹤到最終解決與知識沉澱。系統的設計需要平衡多方需求,既要讓客戶能夠輕鬆提交問題並即時追蹤進度,也要讓支援團隊能夠高效處理工單,同時為管理層提供全面的統計分析能力。在架構設計上,我們採用分層架構,將展示層、業務邏輯層、資料訪問層與資料儲存層清晰分離,確保系統的可維護性與可擴展性。

使用者角色與權限管理是系統的基礎。在典型的 Helpdesk 系統中,我們定義了三種主要角色,每種角色擁有不同的權限與操作範圍。管理員作為系統的超級使用者,負責系統的整體配置、使用者管理、產品目錄維護與資料稽核。客戶是問題的提出者,他們可以查看自己購買的產品、提交支援工單、追蹤處理進度,以及參與知識社群的討論。支援工程師是問題的解決者,他們可以查看所有待處理的工單、更新工單狀態、提供解決方案,以及將常見問題整理為知識庫文章。

工單管理是系統的核心功能。一個完整的工單包含了問題描述、優先級、類別、相關產品、處理歷程、附件檔案等多個維度的資訊。工單的生命週期從建立開始,經過待分派、處理中、待確認、已解決到最終歸檔,每個狀態轉換都應該有明確的觸發條件與通知機制。工單的分派可以採用自動派工演算法,根據工程師的專長領域、當前工作負載與工單的優先級進行智慧分配。工單的升級機制確保高優先級或逾期未處理的問題能夠得到及時關注。

from flask import Flask, request, jsonify, session
from flask_sqlalchemy import SQLAlchemy
from werkzeug.security import generate_password_hash, check_password_hash
from datetime import datetime, timedelta
from enum import Enum
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

app = Flask(__name__)
app.config['SQLALCHEMY_DATABASE_URI'] = 'postgresql://user:pass@localhost/helpdesk'
app.config['SQLALCHEMY_TRACK_MODIFICATIONS'] = False
app.config['SECRET_KEY'] = 'your-secret-key'

db = SQLAlchemy(app)

# 列舉類型定義
class UserRole(Enum):
    """使用者角色"""
    ADMIN = 'admin'
    CUSTOMER = 'customer'
    SUPPORT_ENGINEER = 'support_engineer'

class TicketStatus(Enum):
    """工單狀態"""
    NEW = 'new'
    ASSIGNED = 'assigned'
    IN_PROGRESS = 'in_progress'
    PENDING_CUSTOMER = 'pending_customer'
    RESOLVED = 'resolved'
    CLOSED = 'closed'

class TicketPriority(Enum):
    """工單優先級"""
    LOW = 'low'
    MEDIUM = 'medium'
    HIGH = 'high'
    CRITICAL = 'critical'

class TicketCategory(Enum):
    """工單類別"""
    TECHNICAL = 'technical'
    BILLING = 'billing'
    ACCOUNT = 'account'
    FEATURE_REQUEST = 'feature_request'
    OTHER = 'other'

# 資料模型定義
class User(db.Model):
    """使用者資料模型"""
    __tablename__ = 'users'
    
    id = db.Column(db.Integer, primary_key=True)
    username = db.Column(db.String(80), unique=True, nullable=False)
    email = db.Column(db.String(120), unique=True, nullable=False)
    password_hash = db.Column(db.String(255), nullable=False)
    full_name = db.Column(db.String(120))
    role = db.Column(db.Enum(UserRole), nullable=False)
    is_active = db.Column(db.Boolean, default=True)
    created_at = db.Column(db.DateTime, default=datetime.utcnow)
    last_login = db.Column(db.DateTime)
    
    # 關聯關係
    owned_tickets = db.relationship('Ticket', 
                                   foreign_keys='Ticket.customer_id',
                                   backref='customer',
                                   lazy='dynamic')
    assigned_tickets = db.relationship('Ticket',
                                      foreign_keys='Ticket.assigned_to_id',
                                      backref='assigned_engineer',
                                      lazy='dynamic')
    owned_products = db.relationship('CustomerProduct',
                                    backref='customer',
                                    lazy='dynamic')
    
    def set_password(self, password):
        """設定密碼雜湊"""
        self.password_hash = generate_password_hash(password)
    
    def check_password(self, password):
        """驗證密碼"""
        return check_password_hash(self.password_hash, password)
    
    def to_dict(self):
        """轉換為字典"""
        return {
            'id': self.id,
            'username': self.username,
            'email': self.email,
            'full_name': self.full_name,
            'role': self.role.value,
            'is_active': self.is_active,
            'created_at': self.created_at.isoformat()
        }

class Product(db.Model):
    """產品資料模型"""
    __tablename__ = 'products'
    
    id = db.Column(db.Integer, primary_key=True)
    name = db.Column(db.String(200), nullable=False)
    description = db.Column(db.Text)
    category = db.Column(db.String(100))
    is_active = db.Column(db.Boolean, default=True)
    created_at = db.Column(db.DateTime, default=datetime.utcnow)
    
    # 關聯關係
    customer_products = db.relationship('CustomerProduct',
                                       backref='product',
                                       lazy='dynamic')
    tickets = db.relationship('Ticket',
                             backref='product',
                             lazy='dynamic')

class CustomerProduct(db.Model):
    """客戶產品關聯資料模型"""
    __tablename__ = 'customer_products'
    
    id = db.Column(db.Integer, primary_key=True)
    customer_id = db.Column(db.Integer, db.ForeignKey('users.id'), nullable=False)
    product_id = db.Column(db.Integer, db.ForeignKey('products.id'), nullable=False)
    purchase_date = db.Column(db.DateTime, default=datetime.utcnow)
    support_end_date = db.Column(db.DateTime)
    is_active = db.Column(db.Boolean, default=True)

class Ticket(db.Model):
    """工單資料模型"""
    __tablename__ = 'tickets'
    
    id = db.Column(db.Integer, primary_key=True)
    ticket_number = db.Column(db.String(20), unique=True, nullable=False)
    title = db.Column(db.String(200), nullable=False)
    description = db.Column(db.Text, nullable=False)
    
    # 關聯欄位
    customer_id = db.Column(db.Integer, db.ForeignKey('users.id'), nullable=False)
    product_id = db.Column(db.Integer, db.ForeignKey('products.id'))
    assigned_to_id = db.Column(db.Integer, db.ForeignKey('users.id'))
    
    # 狀態與優先級
    status = db.Column(db.Enum(TicketStatus), default=TicketStatus.NEW, nullable=False)
    priority = db.Column(db.Enum(TicketPriority), default=TicketPriority.MEDIUM, nullable=False)
    category = db.Column(db.Enum(TicketCategory), nullable=False)
    
    # 時間戳記
    created_at = db.Column(db.DateTime, default=datetime.utcnow)
    updated_at = db.Column(db.DateTime, default=datetime.utcnow, onupdate=datetime.utcnow)
    assigned_at = db.Column(db.DateTime)
    resolved_at = db.Column(db.DateTime)
    closed_at = db.Column(db.DateTime)
    
    # 關聯關係
    comments = db.relationship('TicketComment',
                              backref='ticket',
                              lazy='dynamic',
                              order_by='TicketComment.created_at')
    attachments = db.relationship('TicketAttachment',
                                 backref='ticket',
                                 lazy='dynamic')
    
    def generate_ticket_number(self):
        """生成工單編號"""
        prefix = 'TKT'
        date_str = datetime.utcnow().strftime('%Y%m%d')
        count = Ticket.query.filter(
            Ticket.ticket_number.like(f'{prefix}{date_str}%')
        ).count()
        self.ticket_number = f'{prefix}{date_str}{count+1:04d}'
    
    def to_dict(self):
        """轉換為字典"""
        return {
            'id': self.id,
            'ticket_number': self.ticket_number,
            'title': self.title,
            'description': self.description,
            'customer': self.customer.to_dict() if self.customer else None,
            'product': self.product.name if self.product else None,
            'assigned_to': self.assigned_engineer.full_name if self.assigned_engineer else None,
            'status': self.status.value,
            'priority': self.priority.value,
            'category': self.category.value,
            'created_at': self.created_at.isoformat(),
            'updated_at': self.updated_at.isoformat()
        }

class TicketComment(db.Model):
    """工單留言資料模型"""
    __tablename__ = 'ticket_comments'
    
    id = db.Column(db.Integer, primary_key=True)
    ticket_id = db.Column(db.Integer, db.ForeignKey('tickets.id'), nullable=False)
    user_id = db.Column(db.Integer, db.ForeignKey('users.id'), nullable=False)
    content = db.Column(db.Text, nullable=False)
    is_internal = db.Column(db.Boolean, default=False)  # 內部備註不對客戶顯示
    created_at = db.Column(db.DateTime, default=datetime.utcnow)
    
    # 關聯關係
    user = db.relationship('User', backref='comments')

class TicketAttachment(db.Model):
    """工單附件資料模型"""
    __tablename__ = 'ticket_attachments'
    
    id = db.Column(db.Integer, primary_key=True)
    ticket_id = db.Column(db.Integer, db.ForeignKey('tickets.id'), nullable=False)
    filename = db.Column(db.String(255), nullable=False)
    file_path = db.Column(db.String(500), nullable=False)
    file_size = db.Column(db.Integer)
    mime_type = db.Column(db.String(100))
    uploaded_by_id = db.Column(db.Integer, db.ForeignKey('users.id'))
    created_at = db.Column(db.DateTime, default=datetime.utcnow)
    
    # 關聯關係
    uploaded_by = db.relationship('User')

class ForumPost(db.Model):
    """討論區文章資料模型"""
    __tablename__ = 'forum_posts'
    
    id = db.Column(db.Integer, primary_key=True)
    title = db.Column(db.String(200), nullable=False)
    content = db.Column(db.Text, nullable=False)
    author_id = db.Column(db.Integer, db.ForeignKey('users.id'), nullable=False)
    category = db.Column(db.String(100))
    view_count = db.Column(db.Integer, default=0)
    is_pinned = db.Column(db.Boolean, default=False)
    created_at = db.Column(db.DateTime, default=datetime.utcnow)
    updated_at = db.Column(db.DateTime, default=datetime.utcnow, onupdate=datetime.utcnow)
    
    # 關聯關係
    author = db.relationship('User', backref='forum_posts')
    replies = db.relationship('ForumReply',
                             backref='post',
                             lazy='dynamic',
                             order_by='ForumReply.created_at')

class ForumReply(db.Model):
    """討論區回覆資料模型"""
    __tablename__ = 'forum_replies'
    
    id = db.Column(db.Integer, primary_key=True)
    post_id = db.Column(db.Integer, db.ForeignKey('forum_posts.id'), nullable=False)
    author_id = db.Column(db.Integer, db.ForeignKey('users.id'), nullable=False)
    content = db.Column(db.Text, nullable=False)
    created_at = db.Column(db.DateTime, default=datetime.utcnow)
    
    # 關聯關係
    author = db.relationship('User', backref='forum_replies')

# API 端點實作
@app.route('/api/auth/login', methods=['POST'])
def login():
    """使用者登入"""
    try:
        data = request.json
        username = data.get('username')
        password = data.get('password')
        
        if not username or not password:
            return jsonify({'error': 'Username and password required'}), 400
        
        user = User.query.filter_by(username=username).first()
        
        if not user or not user.is_active:
            return jsonify({'error': 'Invalid credentials'}), 401
        
        if not user.check_password(password):
            return jsonify({'error': 'Invalid credentials'}), 401
        
        # 更新最後登入時間
        user.last_login = datetime.utcnow()
        db.session.commit()
        
        # 建立 Session
        session['user_id'] = user.id
        session['user_role'] = user.role.value
        
        logger.info(f"User logged in: {user.username}")
        
        return jsonify({
            'message': 'Login successful',
            'user': user.to_dict()
        }), 200
        
    except Exception as e:
        logger.error(f"Login error: {e}")
        return jsonify({'error': 'Internal server error'}), 500

@app.route('/api/tickets', methods=['POST'])
def create_ticket():
    """建立工單"""
    try:
        # 驗證使用者登入
        if 'user_id' not in session:
            return jsonify({'error': 'Unauthorized'}), 401
        
        user_id = session['user_id']
        data = request.json
        
        # 驗證必填欄位
        required_fields = ['title', 'description', 'category', 'product_id']
        for field in required_fields:
            if field not in data:
                return jsonify({'error': f'{field} is required'}), 400
        
        # 驗證產品所有權
        customer_product = CustomerProduct.query.filter_by(
            customer_id=user_id,
            product_id=data['product_id'],
            is_active=True
        ).first()
        
        if not customer_product:
            return jsonify({'error': 'Product not found or not owned'}), 403
        
        # 建立工單
        ticket = Ticket(
            title=data['title'],
            description=data['description'],
            customer_id=user_id,
            product_id=data['product_id'],
            category=TicketCategory[data['category'].upper()],
            priority=TicketPriority[data.get('priority', 'MEDIUM').upper()]
        )
        
        ticket.generate_ticket_number()
        
        db.session.add(ticket)
        db.session.commit()
        
        logger.info(f"Ticket created: {ticket.ticket_number}")
        
        return jsonify({
            'message': 'Ticket created successfully',
            'ticket': ticket.to_dict()
        }), 201
        
    except KeyError as e:
        return jsonify({'error': f'Invalid value: {e}'}), 400
    except Exception as e:
        logger.error(f"Error creating ticket: {e}")
        db.session.rollback()
        return jsonify({'error': 'Internal server error'}), 500

@app.route('/api/tickets/<int:ticket_id>', methods=['GET'])
def get_ticket(ticket_id):
    """查詢工單詳情"""
    try:
        if 'user_id' not in session:
            return jsonify({'error': 'Unauthorized'}), 401
        
        user_id = session['user_id']
        user_role = session['user_role']
        
        ticket = Ticket.query.get(ticket_id)
        
        if not ticket:
            return jsonify({'error': 'Ticket not found'}), 404
        
        # 權限檢查
        if user_role == UserRole.CUSTOMER.value and ticket.customer_id != user_id:
            return jsonify({'error': 'Unauthorized'}), 403
        
        # 包含留言
        comments = [{
            'id': comment.id,
            'user': comment.user.full_name,
            'content': comment.content,
            'is_internal': comment.is_internal,
            'created_at': comment.created_at.isoformat()
        } for comment in ticket.comments if not comment.is_internal or user_role != UserRole.CUSTOMER.value]
        
        response = ticket.to_dict()
        response['comments'] = comments
        
        return jsonify(response), 200
        
    except Exception as e:
        logger.error(f"Error retrieving ticket: {e}")
        return jsonify({'error': 'Internal server error'}), 500

@app.route('/api/tickets/<int:ticket_id>/comments', methods=['POST'])
def add_ticket_comment(ticket_id):
    """新增工單留言"""
    try:
        if 'user_id' not in session:
            return jsonify({'error': 'Unauthorized'}), 401
        
        user_id = session['user_id']
        user_role = session['user_role']
        data = request.json
        
        if 'content' not in data:
            return jsonify({'error': 'Content is required'}), 400
        
        ticket = Ticket.query.get(ticket_id)
        
        if not ticket:
            return jsonify({'error': 'Ticket not found'}), 404
        
        # 權限檢查
        if user_role == UserRole.CUSTOMER.value and ticket.customer_id != user_id:
            return jsonify({'error': 'Unauthorized'}), 403
        
        # 建立留言
        comment = TicketComment(
            ticket_id=ticket_id,
            user_id=user_id,
            content=data['content'],
            is_internal=data.get('is_internal', False) and user_role != UserRole.CUSTOMER.value
        )
        
        db.session.add(comment)
        
        # 更新工單時間戳記
        ticket.updated_at = datetime.utcnow()
        
        db.session.commit()
        
        logger.info(f"Comment added to ticket {ticket.ticket_number}")
        
        return jsonify({'message': 'Comment added successfully'}), 201
        
    except Exception as e:
        logger.error(f"Error adding comment: {e}")
        db.session.rollback()
        return jsonify({'error': 'Internal server error'}), 500

@app.route('/api/tickets/<int:ticket_id>/assign', methods=['POST'])
def assign_ticket(ticket_id):
    """分派工單"""
    try:
        if 'user_id' not in session:
            return jsonify({'error': 'Unauthorized'}), 401
        
        user_role = session['user_role']
        
        # 只有管理員與支援工程師可以分派工單
        if user_role not in [UserRole.ADMIN.value, UserRole.SUPPORT_ENGINEER.value]:
            return jsonify({'error': 'Unauthorized'}), 403
        
        data = request.json
        engineer_id = data.get('engineer_id')
        
        if not engineer_id:
            return jsonify({'error': 'Engineer ID is required'}), 400
        
        ticket = Ticket.query.get(ticket_id)
        
        if not ticket:
            return jsonify({'error': 'Ticket not found'}), 404
        
        # 驗證工程師存在且為支援工程師角色
        engineer = User.query.filter_by(
            id=engineer_id,
            role=UserRole.SUPPORT_ENGINEER,
            is_active=True
        ).first()
        
        if not engineer:
            return jsonify({'error': 'Invalid engineer'}), 400
        
        # 分派工單
        ticket.assigned_to_id = engineer_id
        ticket.assigned_at = datetime.utcnow()
        ticket.status = TicketStatus.ASSIGNED
        
        db.session.commit()
        
        logger.info(f"Ticket {ticket.ticket_number} assigned to {engineer.username}")
        
        return jsonify({
            'message': 'Ticket assigned successfully',
            'ticket': ticket.to_dict()
        }), 200
        
    except Exception as e:
        logger.error(f"Error assigning ticket: {e}")
        db.session.rollback()
        return jsonify({'error': 'Internal server error'}), 500

if __name__ == '__main__':
    with app.app_context():
        db.create_all()
    app.run(host='0.0.0.0', port=5000, debug=True)

這個完整的 Helpdesk 系統後端實作展示了從資料模型定義到 API 端點實作的完整流程。資料模型設計考慮了實務需求,包含使用者管理、產品管理、工單管理、留言與附件等多個面向。API 端點提供了登入驗證、工單建立與查詢、留言新增、工單分派等核心功能,並實作了適當的權限控管與錯誤處理。

@startuml
!define PLANTUML_FORMAT svg
!theme _none_

skinparam dpi auto
skinparam shadowing false
skinparam linetype ortho
skinparam roundcorner 5
skinparam defaultFontName "Microsoft JhengHei UI"
skinparam defaultFontSize 16
skinparam minClassWidth 140

package "Helpdesk 系統架構" {
  
  package "展示層" {
    rectangle "Web 前端" as web
    rectangle "管理後台" as admin_ui
    rectangle "行動應用" as mobile
  }
  
  package "業務邏輯層" {
    rectangle "認證授權服務" as auth
    rectangle "工單管理服務" as ticket_service
    rectangle "使用者管理服務" as user_service
    rectangle "產品管理服務" as product_service
    rectangle "討論區服務" as forum_service
    rectangle "搜尋服務" as search_service
  }
  
  package "資料訪問層" {
    rectangle "資料庫 DAO" as db_dao
    rectangle "快取 DAO" as cache_dao
    rectangle "搜尋引擎 DAO" as search_dao
  }
  
  package "資料儲存層" {
    database "PostgreSQL\n主資料庫" as postgres
    database "Redis\n快取層" as redis
    database "Solr\n搜尋引擎" as solr
    folder "檔案儲存" as files
  }
  
  package "整合層" {
    rectangle "郵件服務" as email
    rectangle "簡訊通知" as sms
    rectangle "第三方系統整合" as integration
  }
}

web -down-> auth
admin_ui -down-> auth
mobile -down-> auth

auth -down-> db_dao
ticket_service -down-> db_dao
user_service -down-> db_dao
product_service -down-> db_dao
forum_service -down-> db_dao
search_service -down-> search_dao

ticket_service -down-> cache_dao
user_service -down-> cache_dao

db_dao -down-> postgres
cache_dao -down-> redis
search_dao -down-> solr

ticket_service -down-> files
ticket_service --> email
ticket_service --> sms

@enduml

Apache Solr 全文搜尋引擎實戰

Apache Solr 是基於 Apache Lucene 構建的企業級搜尋平台,提供了強大的全文搜尋、多面向搜尋、即時索引、分散式搜尋等功能。相較於傳統的資料庫模糊查詢,Solr 能夠提供更快速、更精確且更具彈性的搜尋能力。Solr 採用倒排索引(Inverted Index)技術,將文件的內容分解為詞彙,並建立詞彙到文件的映射,使得搜尋時能夠快速定位包含特定詞彙的文件。

在 Helpdesk 系統中整合 Solr 能夠顯著提升搜尋體驗。使用者可以透過關鍵字快速找到相關的工單、知識庫文章與討論區貼文。Solr 的多面向搜尋功能允許使用者透過產品類別、工單狀態、時間範圍等維度進行篩選。高亮顯示功能能夠在搜尋結果中突顯匹配的關鍵字,提升可讀性。拼寫建議與自動完成功能則能夠引導使用者輸入正確的查詢詞彙。

Solr 的架構由多個元件組成。核心(Core)是 Solr 的基本工作單元,每個核心包含獨立的索引與配置。Schema 定義了文件的結構,包括欄位類型、欄位定義、分詞器配置等。索引器負責將文件轉換為索引結構,查詢解析器則將使用者的查詢轉換為內部的查詢物件。結果排序器根據相關性分數對搜尋結果進行排序。快取機制則提升了重複查詢的效能。

#!/bin/bash
# Solr 安裝與配置腳本 - CentOS 7/8

set -e

echo "=========================================="
echo "Apache Solr 安裝腳本"
echo "=========================================="

# 系統需求檢查
echo "檢查系統需求..."

# 檢查記憶體
total_mem=$(free -m | awk '/^Mem:/{print $2}')
if [ $total_mem -lt 1024 ]; then
    echo "警告: 系統記憶體少於 1GB,建議至少 2GB"
fi

# 安裝 Java
echo "檢查 Java 環境..."
if ! command -v java &> /dev/null; then
    echo "Java 未安裝,開始安裝 OpenJDK 11..."
    sudo yum install -y java-11-openjdk java-11-openjdk-devel
else
    echo "Java 已安裝: $(java -version 2>&1 | head -n 1)"
fi

# 下載 Solr
SOLR_VERSION="9.4.0"
SOLR_MIRROR="https://archive.apache.org/dist/lucene/solr"
SOLR_ARCHIVE="solr-${SOLR_VERSION}.tgz"

echo "下載 Solr ${SOLR_VERSION}..."
if [ ! -f "$SOLR_ARCHIVE" ]; then
    wget "${SOLR_MIRROR}/${SOLR_VERSION}/${SOLR_ARCHIVE}"
else
    echo "安裝檔案已存在,跳過下載"
fi

# 解壓縮
echo "解壓縮 Solr..."
tar xzf $SOLR_ARCHIVE

# 執行安裝腳本
echo "安裝 Solr 服務..."
sudo bash solr-${SOLR_VERSION}/bin/install_solr_service.sh $SOLR_ARCHIVE

# 啟動 Solr 服務
echo "啟動 Solr 服務..."
sudo systemctl start solr
sudo systemctl enable solr

# 檢查服務狀態
echo "檢查 Solr 服務狀態..."
sudo systemctl status solr

# 建立 Helpdesk 核心
echo "建立 Helpdesk 搜尋核心..."
sudo su - solr -c "/opt/solr/bin/solr create -c helpdesk -n basic_configs"

echo "=========================================="
echo "Solr 安裝完成!"
echo "Web 管理介面: http://localhost:8983/solr"
echo "=========================================="

# 配置防火牆(如果啟用)
if sudo systemctl is-active --quiet firewalld; then
    echo "配置防火牆規則..."
    sudo firewall-cmd --permanent --add-port=8983/tcp
    sudo firewall-cmd --reload
fi

echo "安裝腳本執行完成"

Solr 的配置檔案是其靈活性的關鍵。solrconfig.xml 定義了核心的運行時行為,包括請求處理器、查詢處理器、快取配置、更新處理器等。managed-schema 定義了文件的結構,包括欄位定義、欄位類型、複製欄位、動態欄位等。對於中文搜尋,需要配置適當的中文分詞器,如 IK Analyzer、SmartCN 等。

<?xml version="1.0" encoding="UTF-8" ?>
<!-- Helpdesk 系統 Solr Schema 配置 -->
<schema name="helpdesk-schema" version="1.6">
    
    <!-- 欄位類型定義 -->
    <fieldType name="string" class="solr.StrField" sortMissingLast="true" />
    
    <fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
        <analyzer type="index">
            <tokenizer class="solr.StandardTokenizerFactory"/>
            <filter class="solr.LowerCaseFilterFactory"/>
        </analyzer>
        <analyzer type="query">
            <tokenizer class="solr.StandardTokenizerFactory"/>
            <filter class="solr.LowerCaseFilterFactory"/>
        </analyzer>
    </fieldType>
    
    <!-- 中文文字類型 -->
    <fieldType name="text_cn" class="solr.TextField" positionIncrementGap="100">
        <analyzer type="index">
            <tokenizer class="org.apache.lucene.analysis.cn.smart.HMMChineseTokenizerFactory"/>
            <filter class="solr.LowerCaseFilterFactory"/>
            <filter class="solr.StopFilterFactory" words="stopwords_cn.txt" />
        </analyzer>
        <analyzer type="query">
            <tokenizer class="org.apache.lucene.analysis.cn.smart.HMMChineseTokenizerFactory"/>
            <filter class="solr.LowerCaseFilterFactory"/>
            <filter class="solr.StopFilterFactory" words="stopwords_cn.txt" />
            <filter class="solr.SynonymGraphFilterFactory" synonyms="synonyms_cn.txt" />
        </analyzer>
    </fieldType>
    
    <fieldType name="int" class="solr.IntPointField" docValues="true"/>
    <fieldType name="long" class="solr.LongPointField" docValues="true"/>
    <fieldType name="date" class="solr.DatePointField" docValues="true"/>
    <fieldType name="boolean" class="solr.BoolField" sortMissingLast="true"/>
    
    <!-- 欄位定義 -->
    
    <!-- 必填欄位 -->
    <field name="id" type="string" indexed="true" stored="true" required="true" />
    <field name="_version_" type="long" indexed="false" stored="false"/>
    
    <!-- 工單欄位 -->
    <field name="ticket_number" type="string" indexed="true" stored="true" />
    <field name="title" type="text_cn" indexed="true" stored="true" />
    <field name="description" type="text_cn" indexed="true" stored="true" />
    <field name="customer_name" type="string" indexed="true" stored="true" />
    <field name="product_name" type="string" indexed="true" stored="true" />
    <field name="status" type="string" indexed="true" stored="true" />
    <field name="priority" type="string" indexed="true" stored="true" />
    <field name="category" type="string" indexed="true" stored="true" />
    <field name="assigned_to" type="string" indexed="true" stored="true" />
    <field name="created_at" type="date" indexed="true" stored="true" />
    <field name="updated_at" type="date" indexed="true" stored="true" />
    
    <!-- 知識庫欄位 -->
    <field name="kb_title" type="text_cn" indexed="true" stored="true" />
    <field name="kb_content" type="text_cn" indexed="true" stored="true" />
    <field name="kb_category" type="string" indexed="true" stored="true" />
    <field name="kb_tags" type="string" indexed="true" stored="true" multiValued="true" />
    
    <!-- 討論區欄位 -->
    <field name="forum_title" type="text_cn" indexed="true" stored="true" />
    <field name="forum_content" type="text_cn" indexed="true" stored="true" />
    <field name="forum_author" type="string" indexed="true" stored="true" />
    
    <!-- 文件類型 -->
    <field name="doc_type" type="string" indexed="true" stored="true" />
    
    <!-- 全文搜尋欄位(複製欄位) -->
    <field name="text" type="text_cn" indexed="true" stored="false" multiValued="true" />
    
    <!-- 複製欄位配置 -->
    <copyField source="title" dest="text"/>
    <copyField source="description" dest="text"/>
    <copyField source="kb_title" dest="text"/>
    <copyField source="kb_content" dest="text"/>
    <copyField source="forum_title" dest="text"/>
    <copyField source="forum_content" dest="text"/>
    
    <!-- 唯一鍵 -->
    <uniqueKey>id</uniqueKey>
    
</schema>

資料導入處理器(Data Import Handler, DIH)是 Solr 提供的強大工具,能夠從關聯式資料庫、XML 檔案、CSV 檔案等資料來源批次導入資料到 Solr 索引。DIH 支援全量導入與增量導入兩種模式。全量導入會重新索引所有資料,適合初次建立索引或資料結構發生重大變更時使用。增量導入僅索引自上次導入以來變更的資料,適合定期更新索引使用。

<?xml version="1.0" encoding="UTF-8" ?>
<!-- Data Import Handler 配置 -->
<dataConfig>
    
    <!-- 資料來源配置 -->
    <dataSource 
        type="JdbcDataSource"
        driver="org.postgresql.Driver"
        url="jdbc:postgresql://localhost:5432/helpdesk"
        user="helpdesk_user"
        password="your_password"
        readOnly="true"
        autoCommit="false"
        batchSize="1000"
        transactionIsolation="TRANSACTION_READ_COMMITTED" />
    
    <!-- 工單實體配置 -->
    <document name="tickets">
        <entity 
            name="ticket"
            query="SELECT 
                t.id,
                t.ticket_number,
                t.title,
                t.description,
                t.status,
                t.priority,
                t.category,
                t.created_at,
                t.updated_at,
                u.full_name as customer_name,
                p.name as product_name,
                e.full_name as assigned_to
            FROM tickets t
            LEFT JOIN users u ON t.customer_id = u.id
            LEFT JOIN products p ON t.product_id = p.id
            LEFT JOIN users e ON t.assigned_to_id = e.id
            WHERE t.id > 0"
            
            deltaImportQuery="SELECT 
                t.id,
                t.ticket_number,
                t.title,
                t.description,
                t.status,
                t.priority,
                t.category,
                t.created_at,
                t.updated_at,
                u.full_name as customer_name,
                p.name as product_name,
                e.full_name as assigned_to
            FROM tickets t
            LEFT JOIN users u ON t.customer_id = u.id
            LEFT JOIN products p ON t.product_id = p.id
            LEFT JOIN users e ON t.assigned_to_id = e.id
            WHERE t.id = '${dih.delta.id}'"
            
            deltaQuery="SELECT id FROM tickets 
                WHERE updated_at > '${dih.last_index_time}'"
            
            transformer="RegexTransformer,DateFormatTransformer">
            
            <!-- 欄位對映 -->
            <field column="id" name="id" />
            <field column="ticket_number" name="ticket_number" />
            <field column="title" name="title" />
            <field column="description" name="description" />
            <field column="customer_name" name="customer_name" />
            <field column="product_name" name="product_name" />
            <field column="status" name="status" />
            <field column="priority" name="priority" />
            <field column="category" name="category" />
            <field column="assigned_to" name="assigned_to" />
            <field column="created_at" name="created_at" 
                   dateTimeFormat="yyyy-MM-dd HH:mm:ss" />
            <field column="updated_at" name="updated_at" 
                   dateTimeFormat="yyyy-MM-dd HH:mm:ss" />
            
            <!-- 文件類型標記 -->
            <field column="ticket" name="doc_type" />
            
            <!-- 子實體:工單留言 -->
            <entity name="comments"
                    query="SELECT content FROM ticket_comments 
                           WHERE ticket_id = '${ticket.id}' 
                           AND is_internal = false">
                <field column="content" name="description" />
            </entity>
        </entity>
    </document>
    
    <!-- 知識庫實體配置 -->
    <document name="knowledge_base">
        <entity 
            name="kb_article"
            query="SELECT 
                id,
                title as kb_title,
                content as kb_content,
                category as kb_category,
                tags as kb_tags,
                created_at,
                updated_at
            FROM knowledge_base_articles
            WHERE is_published = true"
            
            deltaQuery="SELECT id FROM knowledge_base_articles 
                WHERE updated_at > '${dih.last_index_time}' 
                AND is_published = true">
            
            <field column="id" name="id" />
            <field column="kb_title" name="kb_title" />
            <field column="kb_content" name="kb_content" />
            <field column="kb_category" name="kb_category" />
            <field column="kb_tags" name="kb_tags" />
            <field column="created_at" name="created_at" 
                   dateTimeFormat="yyyy-MM-dd HH:mm:ss" />
            <field column="updated_at" name="updated_at" 
                   dateTimeFormat="yyyy-MM-dd HH:mm:ss" />
            <field column="knowledge_base" name="doc_type" />
        </entity>
    </document>
    
</dataConfig>

Solr 的查詢介面非常豐富,支援多種查詢語法與參數。標準查詢解析器支援 Lucene 查詢語法,包括詞彙搜尋、片語搜尋、布林運算、範圍查詢、萬用字元查詢等。DisMax 與 eDisMax 查詢解析器則提供了更友善的使用者介面,能夠處理自然語言查詢。面向搜尋允許使用者透過多個維度對搜尋結果進行篩選與統計。高亮功能能夠在搜尋結果中突顯匹配的關鍵字。拼寫檢查與建議功能則能夠糾正使用者的拼寫錯誤。

import pysolr
import logging
from typing import List, Dict, Optional
from datetime import datetime

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

class SolrSearchService:
    """Solr 搜尋服務封裝"""
    
    def __init__(self, solr_url='http://localhost:8983/solr/helpdesk'):
        """
        初始化 Solr 連線
        
        Parameters:
        -----------
        solr_url : str
            Solr 核心的 URL
        """
        self.solr = pysolr.Solr(solr_url, always_commit=True, timeout=10)
        logger.info(f"Connected to Solr: {solr_url}")
    
    def index_ticket(self, ticket_data: Dict):
        """
        索引單一工單
        
        Parameters:
        -----------
        ticket_data : dict
            工單資料
        """
        try:
            # 準備索引文件
            doc = {
                'id': f"ticket_{ticket_data['id']}",
                'doc_type': 'ticket',
                'ticket_number': ticket_data['ticket_number'],
                'title': ticket_data['title'],
                'description': ticket_data['description'],
                'customer_name': ticket_data.get('customer_name', ''),
                'product_name': ticket_data.get('product_name', ''),
                'status': ticket_data['status'],
                'priority': ticket_data['priority'],
                'category': ticket_data['category'],
                'assigned_to': ticket_data.get('assigned_to', ''),
                'created_at': ticket_data['created_at'].isoformat() + 'Z',
                'updated_at': ticket_data['updated_at'].isoformat() + 'Z'
            }
            
            # 索引文件
            self.solr.add([doc])
            
            logger.info(f"Indexed ticket: {ticket_data['ticket_number']}")
            
        except Exception as e:
            logger.error(f"Error indexing ticket: {e}")
            raise
    
    def search_tickets(self, 
                      query: str,
                      filters: Optional[Dict] = None,
                      start: int = 0,
                      rows: int = 10,
                      sort: Optional[str] = None) -> Dict:
        """
        搜尋工單
        
        Parameters:
        -----------
        query : str
            搜尋關鍵字
        filters : dict
            篩選條件
        start : int
            起始位置
        rows : int
            返回數量
        sort : str
            排序方式
        
        Returns:
        --------
        dict
            搜尋結果
        """
        try:
            # 建構查詢參數
            search_params = {
                'q': query if query else '*:*',
                'fq': 'doc_type:ticket',
                'start': start,
                'rows': rows,
                'hl': 'true',
                'hl.fl': 'title,description',
                'hl.simple.pre': '<mark>',
                'hl.simple.post': '</mark>',
                'hl.fragsize': 200
            }
            
            # 添加篩選條件
            if filters:
                filter_queries = []
                
                if 'status' in filters:
                    filter_queries.append(f"status:{filters['status']}")
                
                if 'priority' in filters:
                    filter_queries.append(f"priority:{filters['priority']}")
                
                if 'category' in filters:
                    filter_queries.append(f"category:{filters['category']}")
                
                if 'date_from' in filters:
                    date_from = filters['date_from'].isoformat() + 'Z'
                    filter_queries.append(f"created_at:[{date_from} TO *]")
                
                if 'date_to' in filters:
                    date_to = filters['date_to'].isoformat() + 'Z'
                    filter_queries.append(f"created_at:[* TO {date_to}]")
                
                if filter_queries:
                    search_params['fq'] = [search_params['fq']] + filter_queries
            
            # 添加排序
            if sort:
                search_params['sort'] = sort
            else:
                search_params['sort'] = 'updated_at desc'
            
            # 執行搜尋
            results = self.solr.search(**search_params)
            
            # 處理高亮結果
            highlighted = results.highlighting
            
            # 格式化結果
            formatted_results = {
                'total': results.hits,
                'start': start,
                'rows': rows,
                'results': []
            }
            
            for doc in results.docs:
                ticket_id = doc['id']
                result_item = dict(doc)
                
                # 添加高亮片段
                if ticket_id in highlighted:
                    if 'title' in highlighted[ticket_id]:
                        result_item['title_highlight'] = highlighted[ticket_id]['title'][0]
                    if 'description' in highlighted[ticket_id]:
                        result_item['description_highlight'] = highlighted[ticket_id]['description'][0]
                
                formatted_results['results'].append(result_item)
            
            logger.info(f"Search completed: {results.hits} results found")
            
            return formatted_results
            
        except Exception as e:
            logger.error(f"Search error: {e}")
            raise
    
    def faceted_search(self, query: str) -> Dict:
        """
        面向搜尋
        返回各維度的統計資訊
        
        Parameters:
        -----------
        query : str
            搜尋關鍵字
        
        Returns:
        --------
        dict
            面向統計結果
        """
        try:
            search_params = {
                'q': query if query else '*:*',
                'fq': 'doc_type:ticket',
                'rows': 0,
                'facet': 'true',
                'facet.field': ['status', 'priority', 'category', 'product_name'],
                'facet.mincount': 1
            }
            
            results = self.solr.search(**search_params)
            
            facets = {
                'status': dict(results.facets['facet_fields']['status']),
                'priority': dict(results.facets['facet_fields']['priority']),
                'category': dict(results.facets['facet_fields']['category']),
                'product': dict(results.facets['facet_fields']['product_name'])
            }
            
            return {
                'total': results.hits,
                'facets': facets
            }
            
        except Exception as e:
            logger.error(f"Faceted search error: {e}")
            raise
    
    def suggest_spelling(self, query: str) -> List[str]:
        """
        拼寫建議
        
        Parameters:
        -----------
        query : str
            查詢詞彙
        
        Returns:
        --------
        list
            建議詞彙列表
        """
        try:
            search_params = {
                'q': query,
                'spellcheck': 'true',
                'spellcheck.q': query,
                'spellcheck.count': 5
            }
            
            results = self.solr.search(**search_params)
            
            suggestions = []
            if 'spellcheck' in results.raw_response:
                spellcheck_data = results.raw_response['spellcheck']
                if 'suggestions' in spellcheck_data:
                    for item in spellcheck_data['suggestions']:
                        if isinstance(item, dict) and 'suggestion' in item:
                            suggestions.extend(item['suggestion'])
            
            return suggestions
            
        except Exception as e:
            logger.error(f"Spelling suggestion error: {e}")
            return []
    
    def delete_ticket(self, ticket_id: int):
        """
        刪除工單索引
        
        Parameters:
        -----------
        ticket_id : int
            工單 ID
        """
        try:
            self.solr.delete(id=f"ticket_{ticket_id}")
            logger.info(f"Deleted ticket index: {ticket_id}")
            
        except Exception as e:
            logger.error(f"Error deleting ticket index: {e}")
            raise
    
    def optimize_index(self):
        """
        優化索引
        合併索引段,提升查詢效能
        """
        try:
            self.solr.optimize()
            logger.info("Index optimization completed")
            
        except Exception as e:
            logger.error(f"Index optimization error: {e}")
            raise

# 使用範例
if __name__ == "__main__":
    
    # 初始化搜尋服務
    search_service = SolrSearchService()
    
    # 範例:搜尋工單
    results = search_service.search_tickets(
        query="網路連線問題",
        filters={
            'status': 'new',
            'priority': 'high'
        },
        rows=20
    )
    
    print(f"找到 {results['total']} 筆結果")
    for result in results['results']:
        print(f"工單編號: {result['ticket_number']}")
        print(f"標題: {result.get('title_highlight', result['title'])}")
        print("-" * 60)
    
    # 範例:面向搜尋
    facets = search_service.faceted_search("網路")
    print("\n面向統計:")
    print(f"狀態分佈: {facets['facets']['status']}")
    print(f"優先級分佈: {facets['facets']['priority']}")

這個完整的 Solr 搜尋服務封裝展示了如何在 Python 應用中整合 Solr。從文件索引、全文搜尋、面向搜尋到拼寫建議,提供了完整的搜尋功能實作。高亮功能讓使用者能夠快速定位匹配的關鍵字,篩選條件與排序功能則提供了靈活的搜尋控制。

玄貓認為,一個優秀的企業級 Helpdesk 系統不僅是技術的堆疊,更是對業務流程的深刻理解與使用者體驗的精心設計。從使用者角色的權限管理到工單流程的精細控制,從知識庫的累積到搜尋功能的優化,每個細節都影響著系統的實用性與使用者滿意度。Apache Solr 的整合為系統帶來了強大的搜尋能力,但也需要投入精力進行索引設計、查詢優化與效能調校。在實際部署中,還需要考慮高可用性架構、資料備份策略、監控告警機制等生產環境的需求。只有將技術實作與業務需求緊密結合,才能建構出真正有價值的企業級客服支援系統。