
5.1 AI And Programming
What is AI-Assisted Programming?
Before we get started, we need to set some boundaries, since this topic comes with the temptation and risk of plagiarism. Why offer it as a topic, then? We see and hear that AI is being used seemingly everywhere, and we cannot ignore it as an evolution of programming. Numerous commercials highlight the use of AI to update a company's application in record time, analyze data, or deliver whatever other buzzwords hype up the company's products and abilities. There is no denying it: AI is here, and it is being used. While considering this topic, I often wondered how developers felt when IDEs started to replace the basic text editors used to write code. Was there apprehension that developers would rely too heavily on IDEs, lose their knowledge of the language, and lose the ability to write code without the assistance the IDE provided? Here we are decades later, asking the same questions about what effect advanced AI technology will have on our skill set.
AI is rapidly becoming an integral part of modern software development. Tools like GitHub Copilot, ChatGPT, and other AI-powered assistants are helping developers at all levels to improve productivity and reduce the time spent on repetitive tasks. These tools are especially beneficial for beginners who need guidance, for explaining or expanding documentation through discussion and examples, and for providing alternative constructs for complex processes. Asking AI for help can also get you past the few, poor, or unrelated results that search engines sometimes return.
Let's discuss its benefits, while also discussing self-regulation, maintaining knowledge boundaries, and considerations for its use. It is important that you keep developing and expanding your knowledge of the language, and set boundaries with the use of AI. This may mean trying to solve the problem on your own first, or reading through the documentation to find the parameters before asking the bot. If you think about it, AI draws on the documentation and the many forum posts from Stack Exchange, Stack Overflow, GeoNet, and more to formulate answers. Every code help site is also riddled with bad code, either from the initial question or from the many attempts that fellow developers suggested. AI is not perfect, and trusting it 100% without context could have career-ending consequences, or worse, it could cost lives.
The Role of AI in Modern Development
AI-assisted programming leverages artificial intelligence tools to support developers in various stages of writing, debugging, and understanding code. One of the primary advantages of coding with AI is the ability to streamline the debugging process, saving significant time and effort compared to manual research. AI tools offer valuable insights into your code, allowing developers to address issues more efficiently and return to coding more quickly. In addition to debugging, AI automates routine tasks, freeing developers to focus on more complex and creative challenges. This is particularly beneficial for tasks that involve repetitive or boilerplate code, where AI can quickly generate snippets and scripts, acting as a major time-saver.
Beyond its utility in coding tasks, AI also serves as a powerful learning aid. For developers at any level, AI simplifies complex coding concepts and breaks them down into digestible explanations. This is especially helpful for those encountering unfamiliar functions, libraries, or advanced topics. AI tools provide real-time assistance by translating dense technical documentation into more understandable language. For example, if a developer is struggling with a specific Python library like pandas or NumPy, AI can offer simple explanations and practical applications, making the learning curve less steep and more approachable.
AI also fosters creativity and experimentation in coding by suggesting alternative approaches to problem solving. Developers can explore a variety of solutions, often discovering more efficient or innovative ways to write their code. Instant feedback from AI promotes rapid iteration, helping developers refine their techniques and adapt quickly to different coding styles. For instance, if a developer is unsure whether a loop or list comprehension is more suitable for a task, AI can not only suggest both but also explain the pros and cons of each, enriching the developer’s understanding of Python’s flexibility. However, it’s essential for developers to thoroughly review the output provided by AI, ensuring it aligns with their project requirements and best practices. For example:
is a loop or list comprehension better here?
    idList = []
    with arcpy.da.SearchCursor(clipper, [field]) as cursor:
        for row in cursor:
            id = row[0]
            idList.append(id)
In your example, it's more efficient and concise to use list comprehension. List comprehensions are considered more Pythonic and are used where possible over simple for-loops as they are faster and more readable. So you could refactor the code to be:
    with arcpy.da.SearchCursor(clipper, [field]) as cursor:
        idList = [row[0] for row in cursor]
This version of the code does exactly the same thing as your loop, but in a simpler and more readable way. It creates a list containing the first element of each row in the cursor.
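The claim that both versions build the same list can be checked without ArcGIS installed. This is a minimal sketch, assuming a plain list of one-element tuples (`fake_rows`, a hypothetical stand-in) behaves like the rows a search cursor yields:

```python
# Hypothetical stand-in for the rows a SearchCursor would yield:
# each row is a tuple holding one value per requested field.
fake_rows = [(101,), (102,), (103,)]

# Original loop version
id_list_loop = []
for row in fake_rows:
    id_list_loop.append(row[0])

# List comprehension version suggested by the AI
id_list_comp = [row[0] for row in fake_rows]

# Both approaches should produce the same list
print(id_list_loop == id_list_comp)
```

A quick check like this is cheap, and it confirms the refactor before it goes into the real script.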
A more powerful example can be seen with getting help with regex.
How can this be refactored? GTV_Image[\d?]\|([a-zA-Z0-9]*-[a-zA-Z0-9]*-[a-zA-Z0-9]*-[a-zA-Z0-9]*-[a-zA-Z0-9]*)
Your regular expression GTV_Image[\d?]\|([a-zA-Z0-9]*-[a-zA-Z0-9]*-[a-zA-Z0-9]*-[a-zA-Z0-9]*-[a-zA-Z0-9]*) can be refactored for better readability and efficiency. Here’s a cleaned-up version:
GTV_Image\d?\|([a-zA-Z0-9]{8}-[a-zA-Z0-9]{4}-[a-zA-Z0-9]{4}-[a-zA-Z0-9]{4}-[a-zA-Z0-9]{12})
Here's the breakdown:
- GTV_Image\d?: Matches the literal string "GTV_Image" followed by an optional digit (\d?).
- \|: Matches the pipe character (|).
- ([a-zA-Z0-9]{8}-[a-zA-Z0-9]{4}-[a-zA-Z0-9]{4}-[a-zA-Z0-9]{4}-[a-zA-Z0-9]{12}): Matches a UUID pattern which is composed of:
  - 8 alphanumeric characters
  - 4 alphanumeric characters
  - 4 alphanumeric characters
  - 4 alphanumeric characters
  - 12 alphanumeric characters
This snippet from AI can be copied as a code comment to help readers of your code understand what the regex is doing without leaving the script.
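Before adopting a refactored pattern, it is worth checking that it still matches what the original matched. The sketch below uses Python's built-in `re` module and the `GTV_Image` prefix from the AI's reply. The sample strings are made up for illustration. Inside a character class, `?` is a literal character, so `[\d?]` requires exactly one digit or question mark, while `\d?` makes the digit optional.

```python
import re

# Patterns copied from the exchange above (using the "GTV_Image" spelling).
original = re.compile(
    r"GTV_Image[\d?]\|([a-zA-Z0-9]*-[a-zA-Z0-9]*-[a-zA-Z0-9]*-[a-zA-Z0-9]*-[a-zA-Z0-9]*)"
)
refactored = re.compile(
    r"GTV_Image\d?\|([a-zA-Z0-9]{8}-[a-zA-Z0-9]{4}-[a-zA-Z0-9]{4}-[a-zA-Z0-9]{4}-[a-zA-Z0-9]{12})"
)

# Made-up sample strings, for illustration only.
samples = [
    "GTV_Image1|123e4567-e89b-12d3-a456-426614174000",  # digit + UUID-shaped ID
    "GTV_Image|123e4567-e89b-12d3-a456-426614174000",   # no digit before the pipe
    "GTV_Image1|abc-de-f-1-2",                          # short segments
]

results = []
for text in samples:
    old_hit = original.fullmatch(text) is not None
    new_hit = refactored.fullmatch(text) is not None
    results.append((old_hit, new_hit))
    print(f"{text!r}: original={old_hit}, refactored={new_hit}")
```

Only the first sample is matched by both patterns, so the refactor is stricter in one place and looser in another. Whether that is an improvement depends on the data you actually expect, which the AI cannot know unless you tell it.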
In terms of code review, AI plays a critical role by offering suggestions to improve and optimize code, enhancing both its performance and readability. By identifying potential pitfalls and pointing out runtime errors, AI helps mitigate risks early in the development cycle, contributing to the overall stability and reliability of the code. AI’s ability to provide feedback on coding standards and best practices enables developers to refine their skills and adopt industry-standard practices. For example, if a developer is writing inefficient code with unnecessary repetitions, AI could suggest refactoring techniques like DRY (Don’t Repeat Yourself) to improve the overall design.
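As a small illustration of the DRY idea, the sketch below shows the same unit conversion written out three times and then moved into one function. The helper name `feet_to_meters` and the sample values are hypothetical.

```python
# Repetitive version: the same conversion is written out three times.
width_m = 12 * 0.3048
length_m = 40 * 0.3048
height_m = 9 * 0.3048

# DRY version: the conversion lives in one (hypothetical) helper function.
FEET_TO_METERS = 0.3048


def feet_to_meters(feet):
    """Convert a length in feet to meters."""
    return feet * FEET_TO_METERS


width, length, height = (feet_to_meters(value) for value in (12, 40, 9))

# The refactor should not change any of the results.
print(width == width_m, length == length_m, height == height_m)
```

If the conversion factor ever needs to change, the DRY version has one place to edit instead of three.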
When it comes to understanding technical documentation, AI can significantly ease the burden of deciphering dense and complex material. It breaks down intricate documentation into more manageable sections, offering relevant examples and context to illustrate how certain functions or libraries can be applied. This is especially useful when working with comprehensive frameworks like Django or Flask, where AI can help navigate documentation, clarify usage scenarios, and offer practical examples of implementation. By guiding developers through advanced topics, AI ensures that they not only understand the theory but also know how to apply it effectively in their projects. For example:
what are the parameters for this method? qgis.core.QgsVectorFileWriter.writeAsVectorFormatV3()
The method qgis.core.QgsVectorFileWriter.writeAsVectorFormatV3() that you are asking about takes the following parameters:
    qgis.core.QgsVectorFileWriter.writeAsVectorFormatV3(
        layer,             # QgsVectorLayer object - the vector layer to be written
        filename,          # string - the file path of the output file
        transformContext,  # QgsCoordinateTransformContext object - the context for coordinate transform
        options            # QgsVectorFileWriter.SaveVectorOptions object - options for saving the vector
    )
- layer: A QgsVectorLayer object that refers to the vector layer you're attempting to write. In your case, this is linealLayer, and arealLayer.
- filename: A string containing the file path where the output file should be written. In your code, this is linealFilename, and arealFilename.
- transformContext: A QgsCoordinateTransformContext object. This context is used for every transformation of vector data. In your case, you get this context from qgis.core.QgsProject.instance().transformContext().
- options: A QgsVectorFileWriter.SaveVectorOptions object that allows specifying numerous options for saving the vector, such as the driver to use and whether to overwrite an existing file. In your script, you set these options with options = qgis.core.QgsVectorFileWriter.SaveVectorOptions().
Surface-Level Understanding
One of the main risks of over-reliance on AI is that it can lead to a shallow understanding of coding principles. Developers might miss out on building and expanding the problem-solving skills needed to tackle complex challenges. Relying on AI-generated code without fully understanding it can lead to errors or inefficiencies, because the developer cannot judge the methods the AI assembled into a solution. ArcPy contains some overlapping functions, but they don't always produce the same output; arcpy.analysis.Select() and arcpy.conversion.ExportFeatures() are one example. AI may not know the nuances between these two methods and may choose an inappropriate one, leaving you to write more code than necessary to make up for what the function did not do.
Mistakes and Misinterpretations
AI, while incredibly powerful, is not without its flaws. When it comes to coding, AI can sometimes generate code that is incorrect, suboptimal, or not entirely aligned with the developer’s intent. AI-generated code is based on patterns learned from vast amounts of data, but it doesn’t always understand the specific context of your project. This can lead to several issues:
- Incorrect Logic: AI might produce code that appears correct at first glance but contains logical errors that are not immediately obvious. For example, an AI might generate a loop or condition that doesn’t handle edge cases properly, leading to unexpected behavior.
- Security Vulnerabilities: AI tools might inadvertently introduce security flaws. For instance, an AI might suggest using outdated or insecure libraries, or it might create code that doesn’t properly sanitize user inputs, leading to potential security risks like SQL injection or cross-site scripting (XSS) attacks.
- Performance Issues: AI might generate code that works but is not optimized for performance. This can result in inefficient algorithms or unnecessary complexity that slows down your application or makes it harder to maintain.
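The SQL injection risk mentioned in the list can be shown with Python's built-in `sqlite3` module. This is a minimal sketch: the table, names, and hostile input are made up for illustration. The first query builds the SQL text from the input, and the second passes the input as a parameter.

```python
import sqlite3

# In-memory database with a made-up table, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("alice", "admin"), ("bob", "viewer")],
)

user_input = "nobody' OR '1'='1"  # hostile input

# Unsafe: the input is pasted into the SQL text, so the OR clause is executed.
unsafe_sql = f"SELECT name FROM users WHERE name = '{user_input}'"
unsafe_rows = conn.execute(unsafe_sql).fetchall()

# Safer: a parameterized query treats the input as a plain value.
safe_rows = conn.execute(
    "SELECT name FROM users WHERE name = ?", (user_input,)
).fetchall()

print(len(unsafe_rows), len(safe_rows))
conn.close()
```

The unsafe query returns every user, while the parameterized one returns none because no user has that literal name. When reviewing AI-generated database code, look for SQL text assembled with f-strings or concatenation.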
Take the following code, which compares two dataframes for inequality across multiple columns:
    import pandas as pd

    # Sample DataFrames (df1 and df2) with multiple fields
    df1 = pd.DataFrame({
        'id': [1, 2, 3, 4],
        'is_open': [True, False, True, True],
        'note': ['note1', 'note2', 'note3', 'note4'],
        'status': ['active', 'inactive', 'pending', 'active'],
        'priority': [1, 2, 3, 1]
    })

    df2 = pd.DataFrame({
        'id': [1, 2, 3, 4],
        'is_open': [True, True, True, False],
        'note': ['note1', 'note2_different', 'note3', 'note4'],
        'status': ['active', 'inactive', 'pending', 'closed'],
        'priority': [1, 2, 3, 2]
    })

    # Merge the two DataFrames on 'id'
    merged_df = pd.merge(df1, df2, on='id', suffixes=('_gis', '_cis'))
    # Find rows where at least one of the fields differs
    expected_result_df = merged_df[
        (merged_df['is_open_gis'] != merged_df['is_open_cis']) |
        (merged_df['note_gis'] != merged_df['note_cis']) |
        (merged_df['status_gis'] != merged_df['status_cis']) |
        (merged_df['priority_gis'] != merged_df['priority_cis'])
    ]

    print(expected_result_df)
       id  is_open_gis note_gis  ...         note_cis status_cis priority_cis
    1   2        False    note2  ...  note2_different   inactive            2
    3   4         True    note4  ...            note4     closed            2

    [2 rows x 9 columns]
If you have many fields that you want to compare, you may ask AI to refactor it. AI may give you the snippet:
    fields = ['is_open', 'note', 'status', 'priority']

    expected_result_df = df1[
        df1[[f'{field}_gis' for field in fields]] != df1[[f'{field}_cis' for field in fields]]].dropna(how='all')
Looks OK, right? Copied into the script, it doesn't raise any warnings. At a quick read, it simply uses list comprehensions to build the column comparisons. When it is run, however, it fails with a KeyError:
    Traceback (most recent call last):
      ...
    KeyError: "None of [Index(['is_open_gis', 'note_gis', 'status_gis', 'priority_gis'], dtype='object')] are in the [columns]"
    python-BaseException
Can you spot the error(s) in the refactored AI-generated code?
Misunderstanding of Core Concepts
When developers rely on AI-generated explanations or code without deeper investigation, there’s a risk of misunderstanding core programming concepts. AI can provide explanations that are simplified or not entirely accurate, leading developers to develop a superficial understanding of how certain aspects of Python work. For example, AI might explain a complex concept like recursion in a way that glosses over its intricacies, leading to misconceptions. AI-generated code might not always align with industry best practices. For instance, AI might suggest using global variables instead of local variables, or it might generate code that is difficult to read and maintain. Without a solid understanding of best practices, developers might unknowingly adopt poor coding habits. AI tools might not fully grasp the specific requirements or constraints of a given project, leading to suggestions that are technically correct but contextually inappropriate. For instance, AI might generate code for a web application that works in a general sense but doesn’t comply with the specific security or scalability requirements of the project.
Another significant risk of relying too heavily on AI is the potential to overlook the critical processes of testing and debugging. Developers might assume that AI-generated code is flawless, leading them to skip or rush through the testing phase. This can result in undetected bugs that manifest in production, causing issues for end-users and increasing the cost and complexity of fixing them later. Going back to the dataframe comparison example, does it make sense to compare df1 to df1 to find the differences between df1 and df2? That is the first error. The second, the one the KeyError caught, is that the columns with the _gis and _cis suffixes exist in the merged_df variable, not in df1 as the AI coded it.
Without verifying that the AI's code accurately does what you need, you may introduce logical errors such as this into your code. Debugging is a crucial skill for any programmer, requiring a deep understanding of how code works and how to identify and fix issues. Developers who hand every bug to AI get less practice at it, which can make them less capable of handling complex debugging tasks on their own.
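One possible way to repair the refactor is sketched below. It is an assumption about what the refactor was meant to do, not the only fix. Simply swapping df1 for merged_df would still fail, because pandas refuses to compare two DataFrames whose column labels differ. The sketch therefore compares one pair of columns at a time and combines the results into a single mask (the names `differs` and `result_df` are hypothetical).

```python
import pandas as pd

df1 = pd.DataFrame({
    'id': [1, 2, 3, 4],
    'is_open': [True, False, True, True],
    'note': ['note1', 'note2', 'note3', 'note4'],
    'status': ['active', 'inactive', 'pending', 'active'],
    'priority': [1, 2, 3, 1]
})
df2 = pd.DataFrame({
    'id': [1, 2, 3, 4],
    'is_open': [True, True, True, False],
    'note': ['note1', 'note2_different', 'note3', 'note4'],
    'status': ['active', 'inactive', 'pending', 'closed'],
    'priority': [1, 2, 3, 2]
})
merged_df = pd.merge(df1, df2, on='id', suffixes=('_gis', '_cis'))

fields = ['is_open', 'note', 'status', 'priority']

# Start with "no differences" and OR in each field's comparison.
differs = pd.Series(False, index=merged_df.index)
for field in fields:
    differs |= merged_df[f'{field}_gis'] != merged_df[f'{field}_cis']

# Keep only the rows where at least one field differs.
result_df = merged_df[differs]
print(result_df['id'].tolist())
```

With the sample data this selects the same two rows (ids 2 and 4) as the hand-written comparison. Note that missing values compare as unequal to each other, so data containing NaN would need extra handling.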
Given the risks outlined above, it’s essential for developers to critically review and independently verify AI-generated code. Developers should treat AI-generated code as a starting point, not a final solution. They should review the code line by line, ensuring that it meets the specific needs of their project and adheres to best practices. This process helps identify any potential issues or areas for improvement. Rigorous testing is crucial to ensure that AI-generated code works as expected in all scenarios. Developers should write and run test cases, covering both typical and edge cases, to validate the code’s functionality and robustness.
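Test cases for typical and edge inputs can be sketched with Python's built-in `unittest` module. The helper `changed_ids` below is hypothetical: a plain-Python stand-in for a comparison routine, written only so the tests have something to exercise.

```python
import unittest


def changed_ids(gis_records, cis_records, fields):
    """Return sorted ids whose values differ in any of the given fields.

    Hypothetical helper: both inputs map an id to a dict of field values.
    Ids missing from either input are ignored.
    """
    shared = gis_records.keys() & cis_records.keys()
    return sorted(
        rid for rid in shared
        if any(gis_records[rid][f] != cis_records[rid][f] for f in fields)
    )


class ChangedIdsTests(unittest.TestCase):
    def test_typical_difference(self):
        gis = {1: {'status': 'active'}, 2: {'status': 'pending'}}
        cis = {1: {'status': 'active'}, 2: {'status': 'closed'}}
        self.assertEqual(changed_ids(gis, cis, ['status']), [2])

    def test_identical_inputs(self):
        gis = {1: {'status': 'active'}}
        self.assertEqual(changed_ids(gis, dict(gis), ['status']), [])

    def test_empty_inputs(self):
        self.assertEqual(changed_ids({}, {}, ['status']), [])

    def test_id_missing_from_one_side(self):
        gis = {1: {'status': 'active'}, 9: {'status': 'active'}}
        cis = {1: {'status': 'active'}}
        self.assertEqual(changed_ids(gis, cis, ['status']), [])


# Run the tests and keep the outcome so it can be inspected.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(ChangedIdsTests)
outcome = unittest.TextTestRunner(verbosity=0).run(suite)
print(outcome.wasSuccessful())
```

The edge cases (identical inputs, empty inputs, an id present on only one side) are the ones a quick read of generated code tends to miss, so they are worth writing down before trusting the result.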
Dependency
Another potential downfall is that developers may become too dependent on AI tools, which can hinder their ability to code independently or read and understand code written by someone else. Leaning too heavily on AI can lead to difficulty in working without AI assistance, particularly in environments where AI tools are not available. You may miss where you can make a change in the code that would make it run 75% faster. When you take the time to understand the language, you start to build these nuances into your code from the start and build the ability to think critically and creatively when solving coding challenges.
Balanced Approach
You probably noticed that this lesson contradicted itself a lot. It said AI can provide a deeper understanding of concepts and easy-to-understand explanations, but then went on to say AI may provide oversimplified explanations or inaccurate information. Until AI gets better, both are true. It’s important to strike a balance between using AI tools and developing a deep understanding of the language and of the capabilities of AI. While AI can be a valuable resource, developers should use it to supplement their learning and development, rather than to replace their own effort to master the language. We learn more during the journey than we do at the finish line.