AI in Shortcuts: Unleashing Apple’s “Use Model” Action for Productivity

The landscape of personal computing is undergoing a significant transformation, propelled by the integration of artificial intelligence directly into our everyday tools. Apple’s latest advancements, particularly within its robust Shortcuts automation platform, offer users unprecedented access to AI models through a groundbreaking feature: the Use Model action. This innovation promises to unlock new frontiers in productivity, allowing complex, data-driven tasks to be automated with a sophistication previously unimaginable. However, as with any nascent technology, the reality of implementing AI into practical workflows reveals both immense power and surprising imperfections.

This article delves into the practical application of Apple’s AI models within Shortcuts, drawing insights from real-world experiments in two distinct areas: automated image description for web accessibility and streamlined expense receipt processing. These explorations highlight the iterative nature of prompt engineering, the nuances of AI model behavior, and the critical importance of human oversight in achieving reliable automation.

UNLEASHING AI IN SHORTCUTS: THE USE MODEL ACTION

At the heart of Apple’s latest automation capabilities lies the Use Model action in Shortcuts. This powerful new component acts as a gateway to various artificial intelligence models, allowing users to feed data into an AI and receive processed output. The flexibility of this action is a cornerstone of its appeal, as it supports different deployment environments for the AI models:

  • On-Device Models: These models run directly on your Mac, iPhone, or iPad, offering maximum privacy and often quicker processing times for tasks that don’t require extensive computational resources or vast datasets. They are ideal for sensitive data or scenarios where internet connectivity might be limited.
  • Private Cloud Compute Servers: For more demanding tasks that require greater computational power or access to larger models, Apple’s Private Cloud Compute offers a secure, privacy-preserving solution. Data sent to these servers is encrypted and processed in a way that prevents Apple from accessing it directly, striking a balance between power and privacy.
  • Third-Party AI Services: The architecture also allows for integration with external AI providers like OpenAI, expanding the range of capabilities to include models specialized in various domains, though this option may come with different privacy considerations.

The fundamental principle remains consistent: you provide the AI with input, such as text, images, or documents, and it returns a transformed or analyzed output based on its training. This capability opens the door to automating tasks that traditionally required human cognition or highly complex, rule-based programming, marking a significant leap forward for macOS automation and beyond.

AI FOR IMAGE DESCRIPTIONS: THE ALT TEXT CHALLENGE

One of the most immediate and beneficial applications of AI is in generating descriptions for images. For web publishers, creating accurate and concise alt text for accessibility is a time-consuming but crucial task. The promise of an AI-powered Shortcut to automate this seemed too good to pass up.

The initial experiment involved taking an image and passing it to Apple’s Private Cloud Compute model to generate a description. Early challenges quickly emerged:

PROCESSING LARGE IMAGES

Sending full-resolution images to the cloud model resulted in significant processing delays. The solution was straightforward: resize the image within the Shortcut workflow before sending it to the AI. This pre-processing step drastically improved efficiency without compromising the quality of the description, as the model primarily needed content for understanding, not high fidelity.
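
The resize step comes down to picking target dimensions that keep the aspect ratio. Here is a minimal sketch of that arithmetic in Python. The function name and the 1024-pixel cap are assumptions made for illustration; the experiment does not state what size was used, and in Shortcuts itself the Resize Image action does this work.

```python
def fit_within(width: int, height: int, max_side: int = 1024) -> tuple[int, int]:
    """Scale (width, height) so the longer side is at most max_side.

    max_side=1024 is an illustrative default, not a value from the article.
    """
    if width <= 0 or height <= 0:
        raise ValueError("image dimensions must be positive")
    longest = max(width, height)
    if longest <= max_side:
        return width, height  # already small enough, never upscale
    # Multiply before dividing so exact ratios stay exact.
    new_width = max(1, round(width * max_side / longest))
    new_height = max(1, round(height * max_side / longest))
    return new_width, new_height
```

A 4032 x 3024 photo would come out as 1024 x 768 with this cap, while an image that is already smaller is returned unchanged.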

THE ART OF PROMPT ENGINEERING

While the AI generated remarkably accurate descriptions, they were often too verbose or contained problematic characters (like double quotes) that would break HTML alt attribute tags. This necessitated an iterative process of prompt engineering—refining the instructions given to the AI model. For instance, an initial prompt like “Describe this image” evolved into:

“Describe this image for use in the alt text tag on a webpage. Limit yourself to two sentences at most. If it’s a screenshot, please include all the words. Don’t use double quotes, but only single quotes.”

Despite these precise instructions, the AI’s responses were often inconsistent. Length constraints were frequently ignored, and double quotes stubbornly persisted. This highlights a fundamental characteristic of generative AI: while powerful, it doesn’t always adhere to rigid rules in the way a traditional computer program would.

THE HYBRID SOLUTION AND HUMAN OVERSIGHT

To overcome these inconsistencies, a hybrid approach proved most effective. The Shortcut was designed to:

  • Send the image to the Private Cloud Compute model with the primary descriptive prompt.
  • Receive the description and check its character count.
  • If too long, pass the text to Apple’s On-Device model with a prompt like “Shorten this text to be less than 250 characters.”
  • Finally, use native Shortcuts actions to search for and replace any remaining double quotes with single quotes.
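
The last three steps above can be sketched in Python to show where the model ends and the deterministic cleanup begins. This is an illustration under assumptions, not the Shortcut itself: `clean_alt_text` is a hypothetical name, and `shorten` is a stand-in callable for the On-Device model call. The 250-character limit comes from the shortening prompt.

```python
from typing import Callable

MAX_ALT_LENGTH = 250  # the shortening prompt asks for fewer than 250 characters


def clean_alt_text(
    description: str,
    shorten: Callable[[str], str],
    max_length: int = MAX_ALT_LENGTH,
) -> tuple[str, bool]:
    """Post-process a model-generated description.

    `shorten` is a hypothetical stand-in for the second model call.
    Returns (alt_text, needs_review).
    """
    text = description.strip()
    if len(text) >= max_length:
        # Second model pass. The model may still ignore the limit.
        text = shorten(text).strip()
    # Deterministic fix: straight double quotes would break the alt attribute.
    text = text.replace('"', "'")
    # Flag anything still too long instead of silently truncating it.
    needs_review = len(text) >= max_length
    return text, needs_review
```

The quote replacement runs last and without a model, so it cannot be ignored the way a prompt instruction can. The `needs_review` flag is an added convenience for the human review step, not something the original Shortcut is described as producing.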

Even with this sophisticated multi-stage processing, human intervention remained indispensable. The automation would generate the alt text, but a human still reviewed and adjusted it. This “human-in-the-loop” approach is crucial, as AI models, despite their sophistication, can produce unexpected errors—from inexplicably adding hashtags to misreporting the time in a screenshot description. This experiment underscored that AI is best viewed as a powerful assistant, not a fully autonomous agent, especially for tasks requiring precision and accuracy for external consumption.

STREAMLINING EXPENSES WITH AI: A PROMPT ENGINEERING SAGA

Another compelling use case for AI in Shortcuts lies in automating the tedious process of expense tracking. Manually extracting details like vendor name, date, and dollar amount from receipts (often buried in PDFs or emails) is a common pain point. The idea of an AI model intelligently parsing these documents for key information presented a significant opportunity for productivity gains.

The goal was to extract specific data points from a receipt and format them for direct entry into a spreadsheet, specifically Apple Numbers. Initial attempts involved simply pulling all text and asking the AI for comma-separated values to paste into a spreadsheet. However, pasting delimited text into multiple Numbers columns proved problematic.

DATA FORMATTING FOR SPREADSHEETS

A critical discovery was that the “Add Row to Numbers Spreadsheet” action in Shortcuts works optimally with a list of items. The refined workflow involved asking the AI to provide comma-separated values for the date, vendor, and amount, and then using Shortcuts’ built-in Split Text action to convert these into a proper list, ready for direct insertion into Numbers.
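
For readers who think in code, the Split Text step is roughly equivalent to the Python sketch below. The function name and the sample values are made up for illustration. The field-count check is an addition: a vendor name that itself contains a comma would otherwise shift every column.

```python
def split_row(model_output: str, expected_fields: int = 3) -> list[str]:
    """Turn 'date,vendor,amount' text into a list, one item per column."""
    fields = [field.strip() for field in model_output.strip().split(",")]
    if len(fields) != expected_fields:
        # Typical cause: a comma inside the vendor name, or extra model chatter.
        raise ValueError(
            f"expected {expected_fields} fields, got {len(fields)}: {fields!r}"
        )
    return fields


row = split_row("03/14/2025, Acme Hosting, 209.49")  # sample values, not real data
```

Here `row` holds three separate strings, which is the list shape the Numbers action expects.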

THE COMPLEXITIES OF A MULTI-PART PROMPT

To achieve the desired output, the AI prompt for expense processing became surprisingly detailed and complex, aiming to extract not only the spreadsheet data but also to generate a filename based on the receipt details. A typical prompt might look like this:

“This information is a receipt for payment. Please return the amount of the expense in US dollars, the date of the expense (look for dates in email headers if it’s formatted as such, otherwise use the current date), and who the vendor was (use the subject line and from line in email headers if formatted as such). Also, create a new filename in the format: YYYY-MM-DD-[first five alphanumeric characters of vendor in uppercase]-[full dollar amount with no decimal point or dollar sign]

Format the values as follows, separated by commas:

date in MM/DD/YYYY,vendor,full dollar amount (make sure to include any decimal point but not the string USD),filename”

AI’S PECULIARITIES: CONFUSION AND INCONSISTENCY

Despite the meticulous prompt, the AI exhibited some peculiar and frustrating behaviors:

  • Conflicting Instructions: Asking for the dollar amount in two different formats (e.g., “$209.49” for the spreadsheet and “20949” for the filename) often confused the model, leading to incorrect values being entered into the spreadsheet, such as “20949” instead of “$209.49”: a small error with potentially large financial implications.
  • Counting Errors: Requests for specific character counts (e.g., “first five alphanumeric characters of vendor”) were frequently misinterpreted, with the AI sometimes returning six characters instead of five.
  • Non-Determinism: Perhaps the most vexing issue was the AI’s lack of repeatability. Running the exact same input through the Shortcut multiple times could yield different, sometimes wildly incorrect, results. This non-deterministic nature fundamentally clashes with traditional programming paradigms, where the same input always produces the same output.

This experiment highlighted the inherent unreliability of current AI models for tasks requiring absolute precision. While the AI could perform the general task of extracting information, its inconsistencies meant the automated workflow essentially produced a draft that still required careful human vetting. The question then becomes: if you must double-check everything, does it truly save time?
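
One way to shrink the part that needs vetting, offered here as a suggestion rather than something the experiment tried, is to stop asking the model for the filename at all. Once the date, vendor, and amount have been extracted, the filename can be derived from them with ordinary code, which avoids both the two-format confusion and the character-counting errors. The sketch below assumes ASCII vendor names and amounts with two decimal places. `build_filename` is a hypothetical helper.

```python
import re
from datetime import datetime
from decimal import Decimal


def build_filename(date_text: str, vendor: str, amount_text: str) -> str:
    """Build YYYY-MM-DD-[VENDOR5]-[amount digits] from already-extracted fields."""
    date = datetime.strptime(date_text, "%m/%d/%Y")
    # First five ASCII letters or digits of the vendor, uppercased.
    vendor_code = re.sub(r"[^A-Za-z0-9]", "", vendor)[:5].upper()
    # Decimal avoids float rounding surprises with money.
    amount = Decimal(amount_text.replace("$", "").replace(",", ""))
    digits = f"{amount:.2f}".replace(".", "")
    return f"{date:%Y-%m-%d}-{vendor_code}-{digits}"
```

With the sample fields 03/14/2025, Acme Hosting, and $209.49, this produces 2025-03-14-ACMEH-20949 every time. The model is then asked for the amount in only one format.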

THE PARADOX OF AI IN AUTOMATION: POWER AND IMPERFECTION

The experiences with both image description and expense tracking reveal a crucial paradox in the current state of AI-powered automation: it is simultaneously incredibly powerful and surprisingly fallible. The “Use Model” action in Shortcuts is a testament to Apple’s commitment to bringing advanced AI capabilities to the mainstream, but it also exposes the growing pains of this technology.

AMAZING YET STUPID

The phrase “amazing piece of technology and incredibly stupid” perfectly encapsulates the current reality. AI can understand complex natural language instructions, discern objects in images, and extract relevant information from unstructured text—feats that would have been science fiction just a few years ago. Yet, it struggles with seemingly simple tasks like consistently counting characters or adhering strictly to formatting rules despite explicit instructions. This highlights the difference between pattern recognition and true logical reasoning.

PROMPT ENGINEERING AS A NEW SKILL

The iterative process of refining AI prompts has become an essential new skill for anyone looking to leverage these models effectively. It’s less about traditional coding and more about clear, unambiguous communication with a non-human entity that interprets instructions based on probabilistic models rather than strict rules. The effectiveness of an AI-powered Shortcut often hinges on the quality and specificity of its prompt.

THE RELIABILITY DILEMMA

The non-deterministic nature of AI models poses a significant challenge for mission-critical automation. If the same input can yield different outputs, how can one trust the system for tasks where accuracy is paramount, such as financial record-keeping or accessibility compliance? This inherent variability means that for the foreseeable future, a “human-in-the-loop” approach will remain vital, particularly for workflows where errors could have significant consequences.

This reality may also explain why Apple has adopted a cautious, iterative approach to rolling out some of its more ambitious “Apple Intelligence” features. Building user trust requires not just impressive demos, but reliable, consistent performance in real-world scenarios.

BEST PRACTICES FOR AI-POWERED SHORTCUTS

Despite the current limitations, the potential of AI in Shortcuts is undeniable. To maximize its utility while mitigating its quirks, consider these best practices:

  • Start Simple and Iterate: Begin with straightforward tasks and gradually increase complexity. Test your prompts extensively with diverse inputs.
  • Master Prompt Engineering: Experiment with different phrasings, include examples, and be as explicit as possible about desired outputs (e.g., “return as a list,” “limit to X characters,” “no punctuation”).
  • Leverage Native Shortcuts Actions: Don’t rely solely on AI. Use traditional Shortcuts actions for tasks where deterministic behavior is required, such as resizing images, splitting text, replacing characters, or checking length. This hybrid approach adds robustness.
  • Build in Human Review: For any critical workflow, assume the AI will make mistakes. Always incorporate a step for human validation and correction before the output is finalized or used.
  • Understand Model Limitations: Be aware that AI struggles with precise counting, strict formatting, and complex logical reasoning. Design your workflows to compensate for these weaknesses.
  • Choose the Right Model: Utilize on-device models for privacy-sensitive or high-speed tasks where capabilities align. Opt for Private Cloud Compute for more complex or data-intensive tasks.
  • Implement Error Handling: Consider what happens if the AI returns an unexpected or erroneous output. Can your Shortcut gracefully handle it, or provide a clear indication that human intervention is needed?
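
As a rough illustration of the error-handling idea, the Python sketch below checks an extracted expense row before it is written anywhere and returns a list of problems for a human to look at. The function name, the three-field layout, and the specific checks are assumptions based on the expense example earlier in the article. In Shortcuts, the same checks would be built from If and Match Text actions.

```python
import re
from datetime import datetime

# Optional dollar sign, digits, then exactly two decimal places.
AMOUNT_PATTERN = re.compile(r"^\$?\d+\.\d{2}$")


def find_problems(fields: list[str]) -> list[str]:
    """Return human-readable problems; an empty list means the row looks sane."""
    if len(fields) != 3:
        return [f"expected 3 fields, got {len(fields)}"]
    date_text, vendor, amount_text = fields
    problems = []
    try:
        datetime.strptime(date_text, "%m/%d/%Y")
    except ValueError:
        problems.append(f"date is not a valid month/day/year: {date_text!r}")
    if not vendor.strip():
        problems.append("vendor is empty")
    if not AMOUNT_PATTERN.match(amount_text):
        problems.append(f"amount has no two-digit decimal part: {amount_text!r}")
    return problems
```

A row whose amount came back as 20949 instead of 209.49 fails the amount check, so the Shortcut can stop and ask for review rather than writing the wrong figure to the spreadsheet. Checks like these catch malformed output only. A well-formed but wrong value still needs a human eye.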

CONCLUSION: THE FUTURE OF AUTOMATION WITH APPLE INTELLIGENCE

Apple’s integration of AI models into Shortcuts marks a pivotal moment for personal automation. The “Use Model” action is not just a new feature; it’s a new paradigm for how we interact with our devices and delegate tasks. The experiments in image description and expense tracking demonstrate the immense power that AI brings, transforming previously manual or complex processes into partially automated workflows.

However, these early forays also underscore that AI, in its current form, is a powerful but imperfect tool. It’s not a magic bullet that eliminates the need for human thought or supervision. Instead, it acts as a remarkably capable assistant, one that can significantly accelerate progress but still requires guidance, correction, and a keen eye for detail. The non-deterministic nature of AI means that trust must be built incrementally, and verification remains key.

As AI models continue to evolve and become more sophisticated, the challenges of consistency and reliability will likely diminish. For now, the future of automation with Apple Intelligence is one of exciting potential, demanding creativity in prompt engineering, strategic use of hybrid workflows, and an unwavering commitment to human oversight. It’s a journey into uncharted territory, and while the path may have its bumps, the destination promises a new era of intelligent, personalized productivity.
