Feature/running base #415
Conversation
- Replace custom Japanese theme with a modern design system using OKLCH colors and clean UI components
- Update wiki generation prompts to enforce full relative file paths for accurate references
- Implement commit-hash based permalinks for source code citations to ensure link stability across branches
- Enhance Markdown rendering to automatically parse and linkify file paths with line numbers
- Refactor global CSS and ThemeProvider to use class-based dark mode strategy
- Update landing page and component styles (TokenInput, WikiTypeSelector, ProcessedProjects) to match new design
Summary of Changes

Hello @danielfrey63, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed. This pull request significantly enhances the application's capabilities by integrating Azure AI as a new embedding provider, improving the accuracy and functionality of source code citations within generated wiki pages, and introducing a 'force regeneration' option for better content control. Concurrently, the frontend has undergone a major styling overhaul to a modern design system, complemented by new utility functions for robust local storage handling and comprehensive logging.
Please delete this pull request - my bad
Code Review
This pull request introduces support for Azure embedders by updating configuration files (api/config.py, api/config/embedder.json, api/tools/embedder.py) and the faiss-cpu dependency (api/pyproject.toml). It also significantly refactors the frontend, moving from a Japanese aesthetic CSS theme to a more modern, token-based design (src/app/globals.css, src/app/layout.tsx, src/app/page.tsx, src/components/ProcessedProjects.tsx, src/components/TokenInput.tsx, src/components/WikiTypeSelector.tsx).
Key functional changes include enhanced source citation formatting in prompts (api/prompts.py, api/websocket_wiki.py) to enforce full relative paths and line numbers, and improvements to the wiki generation process (src/app/[owner]/[repo]/page.tsx). The latter now includes content validation, logging, fetching of commit hashes for accurate source linking, and a new 'force regeneration' feature. A safeLocalStorage utility (src/utils/localStorage.ts) was added to mitigate a Node.js v25 bug, and a new logger utility (src/utils/logger.ts) was introduced for structured frontend logging. The Markdown component (src/components/Markdown.tsx) was updated to correctly render and link these new citation formats. The package-lock.json file was updated to reflect peer: true for several dependencies. Internationalization messages were also updated to include text for the new 'force regeneration' option.
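The implementation of the safeLocalStorage utility is not shown in this review, so the following is only a minimal sketch of what such a wrapper typically looks like; the method names, SSR guard, and fallback behavior are assumptions rather than the actual contents of src/utils/localStorage.ts.

```typescript
// Hypothetical sketch of a safeLocalStorage wrapper; the real src/utils/localStorage.ts may differ.
// The idea is to guard every localStorage access so server-side rendering and throwing runtimes fail soft.
export const safeLocalStorage = {
  getItem(key: string): string | null {
    try {
      if (typeof window === 'undefined') return null; // no localStorage during SSR
      return window.localStorage.getItem(key);
    } catch {
      return null; // storage disabled or throwing (e.g. privacy mode)
    }
  },
  setItem(key: string, value: string): void {
    try {
      if (typeof window === 'undefined') return;
      window.localStorage.setItem(key, value);
    } catch {
      // swallow write failures; callers treat storage as best-effort
    }
  },
  removeItem(key: string): void {
    try {
      if (typeof window === 'undefined') return;
      window.localStorage.removeItem(key);
    } catch {
      // ignore
    }
  },
};
```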
Review comments highlighted that the embedder configuration logic in api/config.py could be simplified using a dictionary map for better scalability. Additionally, the logic for fetching commit hashes within src/app/[owner]/[repo]/page.tsx was noted as complex and duplicated, suggesting extraction into a reusable utility function. The generateFileUrl function in src/components/Markdown.tsx was identified as a duplicate of logic in src/app/[owner]/[repo]/page.tsx, recommending it be moved to a shared utility file to adhere to the DRY principle.
| return configs.get("embedder_google", {}) | ||
| elif embedder_type == 'ollama' and 'embedder_ollama' in configs: | ||
| return configs.get("embedder_ollama", {}) | ||
| elif embedder_type == 'azure' and 'embedder_azure' in configs: | ||
| return configs.get("embedder_azure", {}) | ||
| else: | ||
| return configs.get("embedder", {}) |
This if/elif/else chain for selecting the embedder configuration can be simplified and made more scalable by using a dictionary mapping. This avoids adding a new elif block every time a new embedder type is introduced and makes the code easier to read and maintain.
```python
embedder_type = EMBEDDER_TYPE
config_map = {
    "google": "embedder_google",
    "ollama": "embedder_ollama",
    "azure": "embedder_azure",
}
config_key = config_map.get(embedder_type)
if config_key and config_key in configs:
    return configs.get(config_key, {})
return configs.get("embedder", {})
```

```tsx
// Fetch commit hash for cached wiki data
console.log('Attempting to fetch commit hash for cached wiki...');
console.log('repoInfoForCommitHash.type:', repoInfoForCommitHash.type);
console.log('repoInfoForCommitHash.repoUrl:', repoInfoForCommitHash.repoUrl);
console.log('repoInfoForCommitHash.owner:', repoInfoForCommitHash.owner);
console.log('repoInfoForCommitHash.repo:', repoInfoForCommitHash.repo);

if (repoInfoForCommitHash.type === 'github' && repoInfoForCommitHash.repoUrl) {
  const owner = repoInfoForCommitHash.owner;
  const repo = repoInfoForCommitHash.repo;

  console.log('Fetching commit hash for:', owner, '/', repo);

  const getGithubApiUrl = (repoUrl: string): string => {
    try {
      const url = new URL(repoUrl);
      const hostname = url.hostname;
      if (hostname === 'github.com') {
        return 'https://api.github.com';
      }
      return `${url.protocol}//${hostname}/api/v3`;
    } catch {
      return 'https://api.github.com';
    }
  };

  const createGithubHeaders = (token: string | null) => {
    const headers: Record<string, string> = {
      'Accept': 'application/vnd.github.v3+json'
    };
    if (token) {
      headers['Authorization'] = `token ${token}`;
    }
    return headers;
  };

  const githubApiBaseUrl = getGithubApiUrl(repoInfoForCommitHash.repoUrl!);
  const currentToken = repoInfoForCommitHash.type === 'github' ? (typeof window !== 'undefined' ? localStorage.getItem('github_token') : null) :
    repoInfoForCommitHash.type === 'gitlab' ? (typeof window !== 'undefined' ? localStorage.getItem('gitlab_token') : null) : null;

  console.log('GitHub API URL:', githubApiBaseUrl);
  console.log('Has token:', !!currentToken);

  try {
    const repoInfoResponse = await fetch(`${githubApiBaseUrl}/repos/${owner}/${repo}`, {
      headers: createGithubHeaders(currentToken)
    });

    console.log('Repo info response status:', repoInfoResponse.status);

    if (repoInfoResponse.ok) {
      const repoData = await repoInfoResponse.json();
      const defaultBranchLocal = repoData.default_branch;
      setDefaultBranch(defaultBranchLocal || 'main');
      console.log('Default branch:', defaultBranchLocal);

      // Get the latest commit hash
      try {
        const branchResponse = await fetch(`${githubApiBaseUrl}/repos/${owner}/${repo}/branches/${defaultBranchLocal || 'main'}`, {
          headers: createGithubHeaders(currentToken)
        });

        console.log('Branch response status:', branchResponse.status);

        if (branchResponse.ok) {
          const branchData = await branchResponse.json();
          const latestCommitHash = branchData.commit?.sha;
          if (latestCommitHash) {
            setCommitHash(latestCommitHash.substring(0, 8));
            console.log(`✅ Found commit hash for cached wiki: ${latestCommitHash.substring(0, 8)}`);
          } else {
            console.warn('No commit SHA in branch data');
          }
        } else {
          console.warn('Branch response not OK:', await branchResponse.text());
        }
      } catch (err) {
        console.error('Error fetching commit hash for cached wiki:', err);
      }
    } else {
      console.warn('Repo info response not OK:', await repoInfoResponse.text());
    }
  } catch (err) {
    console.error('Error fetching repository info for cached wiki:', err);
  }
} else {
  console.log('Skipping commit hash fetch - not a GitHub repo or missing repoUrl');
}
```
This block of code for fetching the commit hash when loading from cache is too large and complex to sit inside a useEffect. It also duplicates logic for creating GitHub API URLs and headers from fetchRepositoryStructure. Consider extracting this logic into a dedicated, reusable utility function, e.g., fetchRepoDetails(repoInfo, token), which could return the default branch and commit hash. This would significantly improve the readability and maintainability of this useEffect hook.
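The helper itself is not part of this PR; a minimal sketch, assuming the fetchRepoDetails(repoInfo, token) signature suggested above, a hypothetical RepoInfo shape, and the same GitHub endpoints already used in the useEffect, might look like this:

```typescript
// Minimal sketch of the suggested fetchRepoDetails helper; the RepoInfo shape and the
// 8-character short hash are assumptions, and the endpoints mirror the useEffect above.
interface RepoInfo {
  repoUrl: string;
  owner: string;
  repo: string;
}

interface RepoDetails {
  defaultBranch: string;
  commitHash: string | null; // shortened hash, or null if it could not be resolved
}

export async function fetchRepoDetails(repoInfo: RepoInfo, token: string | null): Promise<RepoDetails> {
  // Resolve the API base URL (github.com vs. GitHub Enterprise), as in getGithubApiUrl above.
  let apiBaseUrl = 'https://api.github.com';
  try {
    const url = new URL(repoInfo.repoUrl);
    if (url.hostname !== 'github.com') {
      apiBaseUrl = `${url.protocol}//${url.hostname}/api/v3`;
    }
  } catch {
    // keep the default
  }

  const headers: Record<string, string> = { 'Accept': 'application/vnd.github.v3+json' };
  if (token) {
    headers['Authorization'] = `token ${token}`;
  }

  // Look up the default branch, falling back to 'main' on failure.
  const repoRes = await fetch(`${apiBaseUrl}/repos/${repoInfo.owner}/${repoInfo.repo}`, { headers });
  const defaultBranch: string = repoRes.ok ? ((await repoRes.json()).default_branch ?? 'main') : 'main';

  // Look up the latest commit on that branch.
  let commitHash: string | null = null;
  const branchRes = await fetch(
    `${apiBaseUrl}/repos/${repoInfo.owner}/${repoInfo.repo}/branches/${defaultBranch}`,
    { headers }
  );
  if (branchRes.ok) {
    const sha: string | undefined = (await branchRes.json()).commit?.sha;
    if (sha) {
      commitHash = sha.substring(0, 8);
    }
  }

  return { defaultBranch, commitHash };
}
```

The useEffect would then reduce to a single call such as `const { defaultBranch, commitHash } = await fetchRepoDetails(repoInfoForCommitHash, currentToken)`, followed by the two state setters.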
```tsx
const generateFileUrl = (filePath: string, lineStart?: number, lineEnd?: number): string => {
  if (!repoUrl) return '';

  try {
    const url = new URL(repoUrl);
    const hostname = url.hostname;
    let fileUrl = '';

    // Use commit hash if available, otherwise use default branch
    const ref = commitHash || defaultBranch;

    if (hostname === 'github.com' || hostname.includes('github')) {
      // GitHub URL format: https://github.com/owner/repo/blob/commit-hash/path#L1-L50
      fileUrl = `${repoUrl}/blob/${ref}/${filePath}`;
      if (lineStart) {
        fileUrl += `#L${lineStart}`;
        if (lineEnd && lineEnd !== lineStart) {
          fileUrl += `-L${lineEnd}`;
        }
      }
      return fileUrl;
    } else if (hostname === 'gitlab.com' || hostname.includes('gitlab')) {
      // GitLab URL format: https://gitlab.com/owner/repo/-/blob/commit-hash/path#L1-50
      fileUrl = `${repoUrl}/-/blob/${ref}/${filePath}`;
      if (lineStart) {
        fileUrl += `#L${lineStart}`;
        if (lineEnd && lineEnd !== lineStart) {
          fileUrl += `-${lineEnd}`;
        }
      }
      return fileUrl;
    } else if (hostname === 'bitbucket.org' || hostname.includes('bitbucket')) {
      // Bitbucket URL format: https://bitbucket.org/owner/repo/src/commit-hash/path#lines-1:50
      fileUrl = `${repoUrl}/src/${ref}/${filePath}`;
      if (lineStart) {
        fileUrl += `#lines-${lineStart}`;
        if (lineEnd && lineEnd !== lineStart) {
          fileUrl += `:${lineEnd}`;
        }
      }
      return fileUrl;
    }
  } catch (error) {
    console.warn('Error generating file URL:', error);
  }

  return '';
};
```
This generateFileUrl function is duplicated from src/app/[owner]/[repo]/page.tsx. To adhere to the DRY (Don't Repeat Yourself) principle, this function should be moved to a shared utility file (e.g., src/utils/generateFileUrl.ts) and imported where needed. This will make maintenance easier and prevent inconsistencies.
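The shared module itself is not part of this diff; a sketch of what src/utils/generateFileUrl.ts could look like, with repoUrl and the resolved ref passed in as parameters instead of being read from component state (the parameter order and names are assumptions), follows:

```typescript
// Hypothetical sketch of the suggested src/utils/generateFileUrl.ts; parameter names are assumptions.
// Passing repoUrl and ref explicitly lets both src/app/[owner]/[repo]/page.tsx and
// src/components/Markdown.tsx import the same function instead of keeping two copies.
export function generateFileUrl(
  repoUrl: string,
  ref: string, // commit hash if known, otherwise the default branch
  filePath: string,
  lineStart?: number,
  lineEnd?: number
): string {
  try {
    const hostname = new URL(repoUrl).hostname;
    if (hostname === 'github.com' || hostname.includes('github')) {
      let fileUrl = `${repoUrl}/blob/${ref}/${filePath}`;
      if (lineStart) fileUrl += `#L${lineStart}${lineEnd && lineEnd !== lineStart ? `-L${lineEnd}` : ''}`;
      return fileUrl;
    }
    if (hostname === 'gitlab.com' || hostname.includes('gitlab')) {
      let fileUrl = `${repoUrl}/-/blob/${ref}/${filePath}`;
      if (lineStart) fileUrl += `#L${lineStart}${lineEnd && lineEnd !== lineStart ? `-${lineEnd}` : ''}`;
      return fileUrl;
    }
    if (hostname === 'bitbucket.org' || hostname.includes('bitbucket')) {
      let fileUrl = `${repoUrl}/src/${ref}/${filePath}`;
      if (lineStart) fileUrl += `#lines-${lineStart}${lineEnd && lineEnd !== lineStart ? `:${lineEnd}` : ''}`;
      return fileUrl;
    }
  } catch {
    // fall through for unparsable URLs
  }
  return '';
}
```

Both call sites would then invoke it with their own repoUrl and `commitHash || defaultBranch` values.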